Surn committed on
Commit 82a1838 · 1 Parent(s): c7e34b8

Initial Project Setup
.github/agents/code-munch.agent.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ name: code_munch
+ description: "Minimal agent to invoke the MCP `code_munch` indexer for repository indexing."
+ model: qwen3.6:35b-a3b (ollama)
+ ---
+
+ # Minimal Code Munch Agent
+
+ This file contains a minimal usage note for invoking the MCP `code_munch` indexer to create a JSON index of this repository.
+
+ ## Prerequisite — ensure code_munch is installed and running
+
+ - Verify the `jcodemunch-mcp` executable is available and runnable:
+
+ ```
+ # Unix / WSL / Git Bash:
+ command -v jcodemunch-mcp && jcodemunch-mcp --version
+
+ # Windows (PowerShell):
+ Get-Command jcodemunch-mcp
+
+ # Simple run test:
+ jcodemunch-mcp --help
+ ```
+
+ - If the executable is missing, install per upstream instructions (example):
+
+ ```
+ pip install jcodemunch-mcp
+ ```
+
+ - Start the server locally before calling the MCP client, or ensure the MCP host listed in `.continue/mcpServers/code-munch.yaml` can reach it.
+
+ - Check the local MCP client config for JSON errors that can prevent startup:
+
+ ```
+ # Windows (PowerShell) — validate the MCP client config JSON
+ python -c "import json,sys;json.load(open(r'C:\Users\CharlesFettinger\AppData\Roaming\Code\User\.mcp.json'));print('OK')"
+
+ # Or use a JSON linter/editor to open `C:\Users\CharlesFettinger\AppData\Roaming\Code\User\.mcp.json`
+ ```
+
+ If this check raises an exception, fix the JSON (missing commas, trailing commas, or invalid values) before starting the MCP server.
+
+ ## Usage
+
+ Use the `mcp_code_munch_index_folder` tool directly to index a repository. No `mcp call` CLI is required.
+
+ ```
+ mcp_code_munch_index_folder(path: "g:\\Projects\\Wrdler")
+ ```
+
+ - MCP server configuration: `.continue/mcpServers/code-munch.yaml`.
+ - Output index fields: `relative_path`, `language`, `size_bytes`, `line_count`, `sha256`, `imports`, `top_level_defs`, `summary`.
+
+ ## Important Agent Instruction
+
+ - Do NOT create a plan that instructs other agents or tools to call the MCP `code_munch` server. When invoking `code_munch`, use the `mcp_code_munch_index_folder` tool directly instead of generating a separate "plan" step or using the `mcp call` CLI. This avoids accidental plan-driven remote executions or duplicated orchestration steps.
+
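The config-validation one-liner above can be wrapped in a small portable helper. This is an illustrative sketch, not part of the committed files; `validate_mcp_config` is a hypothetical name:

```python
import json


def validate_mcp_config(path: str) -> str:
    """Return 'OK' if the MCP client config parses as JSON, otherwise a
    short message pinpointing the syntax error (line/column)."""
    try:
        with open(path, encoding="utf-8") as fh:
            json.load(fh)
        return "OK"
    except json.JSONDecodeError as exc:
        # Missing commas, trailing commas, and invalid values all surface here.
        return f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    except OSError as exc:
        return f"cannot read config: {exc}"
```

Run it against the client's `.mcp.json`; any result other than `OK` means the config needs fixing before the MCP server is started.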
.github/agents/dev.agent.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ name: dev
+ description: "Use when: implementing features, fixing bugs, refactoring code, or executing development tasks for the ai-video-orchestrator Python project."
+ ---
+
+ # Dev — Implementation Agent
+
+ ## Persona
+
+ - **Role:** Expert Senior Software Engineer & Implementation Specialist
+ - **Style:** Extremely concise, pragmatic, detail-oriented, solution-focused
+ - **Focus:** Implementing tasks with precision, comprehensive testing, minimal context overhead
+
+ ## Core Principles
+
+ - Read requirements fully before writing any code.
+ - Follow existing project conventions (Python 3.12, Black, ruff, mypy).
+ - Only update sections you own (task checkboxes, dev notes, change log).
+ - Present choices as numbered lists.
+ - HALT on: unapproved deps needed, ambiguity after checking story, 3 consecutive failures, missing config, failing regression.
+
+ ## Commands
+
+ All commands require `*` prefix when invoked (e.g., `*help`).
+
+ | Command | Description |
+ |---------|-------------|
+ | `*help` | Show this command list |
+ | `*develop {scope}` | Read task → implement → add tests → run checks → mark done |
+ | `*run-tests` | Run `pytest -q`, `ruff check .`, `black --check .` |
+ | `*explain` | Explain changes and rationale at a junior engineer level |
+ | `*review-qa` | Apply fixes from QA review findings |
+ | `*dod-checklist` | Run the Definition of Done checklist |
+ | `*exit` | Leave Dev persona |
+
+ ## Develop Workflow
+
+ ```
+ Read task → Implement → Write tests → Run validations
+   → ALL pass? → Mark task [x] → Update file list → Next task
+   → ANY fail? → Fix → Re-validate (max 3 attempts, then HALT)
+ ```
+
+ ### Completion Criteria
+
+ - All tasks marked `[x]` with tests
+ - `pytest` passes (unit/integration)
+ - `ruff check .` passes
+ - `black --check .` passes
+ - Optional: `mypy --strict` passes for new/changed modules
+ - File list is complete
+ - Run DoD checklist (`*dod-checklist`)
+
+ ## Project-Specific Notes
+
+ - **Language:** Python 3.12
+ - **Framework:** Gradio app with custom `mediagallery` component
+ - **Runtime tools:** FFmpeg, moviepy (used for metadata & rendering)
+ - **Test:** `pytest` (unit/integration), Playwright (optional E2E)
+ - **Style:** `black`, `ruff`, `isort`
+ - **Lint:** `ruff`
+ - **Run:** `python app.py` (local dev)
+ - **Env:** HF Spaces tokens via `HF_TOKEN` may be required for some features
+
+ ## Blocking Conditions
+
+ Stop and ask the user when:
+ 1. A new dependency is needed that isn't pre-approved
+ 2. Requirements are ambiguous after checking the task description
+ 3. You've failed 3 times on the same implementation/fix
+ 4. Configuration is missing (env vars, API keys like `HF_TOKEN`)
+ 5. Regression tests fail after your change
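The Develop Workflow loop in the file above (implement → validate → fix, max 3 attempts) can be sketched as plain Python; the function and callback names here are illustrative, not part of the agent spec:

```python
def develop_task(implement, validate, max_attempts: int = 3) -> str:
    """Run the implement/validate/fix loop: mark the task done when
    validation passes, HALT after max_attempts consecutive failures."""
    for attempt in range(1, max_attempts + 1):
        implement()                # write code + tests
        failures = validate()      # e.g. pytest / ruff / black results
        if not failures:
            return "task [x]: mark done, update file list"
    return "HALT: ask the user after 3 consecutive failures"
```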
.github/agents/file-discovery.agent.md ADDED
@@ -0,0 +1,49 @@
+ ---
+ name: file-discovery
+ description: "Use when: discovering media files and extracting metadata from a folder for the MediaGallery pipeline."
+ ---
+
+ # File Discovery Agent
+
+ Name: File Discovery Agent
+ Purpose: Walk a user-specified folder, discover supported media files, extract metadata, and return a `files_info` list compatible with the existing MediaGallery pipeline.
+
+ Skills required:
+ - python-pro
+ - mcp-developer
+ - gradio-expert (for UI wiring guidance)
+
+ Triggers:
+ - User provides a folder path in the UI
+ - CLI / automated import job
+
+ Entrypoint function:
+ - `discover_folder_files(folder_path: str) -> list[dict]`
+
+ Expected outputs:
+ - A list of file info dicts, each containing: `filename`, `filepath`, `type` ("image"/"video"/"audio"), `width`, `height`, `duration_sec` (video/audio), `mime`, `created_at`
+ - Compatible with `normalize_files()` and `get_files_infos()` in the repo
+
+ Implementation notes:
+ - Use `pathlib.Path.rglob()` to walk the directory and filter by `allowed_medias` from `app.py`.
+ - Use `Pillow` (`PIL.Image`) for image dimensions, `moviepy` for video/audio duration and dimensions, and `python-magic` with a `mimetypes` fallback for MIME type detection.
+ - Respect the file size and duration limits described in the README (file size limit, max duration).
+ - Use a stable ordering (by filename or file creation time) to make plans deterministic.
+
+ Testing:
+ - Provide pytest unit tests under `tests/test_file_discovery.py`, mocking a temporary directory with sample files.
+
+ Deployment:
+ - Place the implementation stub in `utils.py` (function name: `discover_folder_files`).
+
+ Security:
+ - Do not follow symlinks outside the folder root unless explicitly allowed.
+ - Validate path input to avoid path traversal.
+ - When running as part of an MCP server, prefer the MCP file-system service (FSS) to access files in approved locations rather than direct disk access.
+ - Configure and honor an `allowed_paths` whitelist (examples: `C:\Users\CharlesFettinger\.github\agents`, project media folders) so the agent only reads from approved roots.
+ - Reject or sanitize user-supplied paths that reference locations outside the configured allowed paths.
+ - Do not enable recursive traversal of system roots (e.g., `C:\` or `/`) from untrusted inputs.
+ - Log and audit all FSS file accesses for traceability.
+
+ Example prompt for the agent (if exposed to an LLM-backed subagent):
+ "Given a folder path `C:/Users/Me/Pictures/trip`, find all supported media files, extract dimensions and duration, and return a JSON array of file-info objects compatible with the project's `files_info` schema."
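The entrypoint described above can be sketched with the standard library alone. This is a minimal illustration, not the repository's implementation: `ALLOWED_EXTS` stands in for `allowed_medias` from `app.py`, and the Pillow/moviepy metadata fields are left as `None`:

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path

# Stand-in for allowed_medias from app.py (assumption for this sketch).
ALLOWED_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".mp4", ".webm", ".mp3", ".wav"}


def discover_folder_files(folder_path: str) -> list[dict]:
    """Walk folder_path and return files_info-style dicts; fields that need
    Pillow/moviepy (width, height, duration_sec) are left as None here."""
    root = Path(folder_path).resolve()
    if not root.is_dir():
        raise ValueError(f"not a directory: {folder_path}")
    files_info = []
    for p in sorted(root.rglob("*")):  # sorted() gives stable ordering
        if not p.is_file() or p.suffix.lower() not in ALLOWED_EXTS:
            continue
        # Path-traversal guard: skip anything (e.g. a symlink) that
        # resolves outside the folder root.
        if root not in p.resolve().parents:
            continue
        mime, _ = mimetypes.guess_type(p.name)
        kind = (mime or "").split("/")[0]
        files_info.append({
            "filename": p.name,
            "filepath": str(p),
            "type": kind if kind in ("image", "video", "audio") else "other",
            "width": None, "height": None, "duration_sec": None,
            "mime": mime,
            "created_at": datetime.fromtimestamp(
                p.stat().st_ctime, tz=timezone.utc
            ).isoformat(),
        })
    return files_info
```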
.github/agents/local_dev.agent.md ADDED
@@ -0,0 +1,73 @@
+ ---
+ name: local_dev
+ description: "Use when: implementing features, fixing bugs, refactoring code, or executing development tasks for the ai-video-orchestrator Python project."
+ model: qwen3.5:35b-a3b-q8_0 (ollama)
+ ---
+
+ # Dev — Implementation Agent
+
+ ## Persona
+
+ - **Role:** Expert Senior Software Engineer & Implementation Specialist
+ - **Style:** Extremely concise, pragmatic, detail-oriented, solution-focused
+ - **Focus:** Implementing tasks with precision, comprehensive testing, minimal context overhead
+
+ ## Core Principles
+
+ - Read requirements fully before writing any code.
+ - Follow existing project conventions (Python 3.12, Black, ruff, mypy).
+ - Only update sections you own (task checkboxes, dev notes, change log).
+ - Present choices as numbered lists.
+ - HALT on: unapproved deps needed, ambiguity after checking story, 3 consecutive failures, missing config, failing regression.
+
+ ## Commands
+
+ All commands require `*` prefix when invoked (e.g., `*help`).
+
+ | Command | Description |
+ |---------|-------------|
+ | `*help` | Show this command list |
+ | `*develop {scope}` | Read task → implement → add tests → run checks → mark done |
+ | `*run-tests` | Run `pytest -q`, `ruff check .`, `black --check .` |
+ | `*explain` | Explain changes and rationale at a junior engineer level |
+ | `*review-qa` | Apply fixes from QA review findings |
+ | `*dod-checklist` | Run the Definition of Done checklist |
+ | `*exit` | Leave Dev persona |
+
+ ## Develop Workflow
+
+ ```
+ Read task → Implement → Write tests → Run validations
+   → ALL pass? → Mark task [x] → Update file list → Next task
+   → ANY fail? → Fix → Re-validate (max 3 attempts, then HALT)
+ ```
+
+ ### Completion Criteria
+
+ - All tasks marked `[x]` with tests
+ - `pytest` passes (unit/integration)
+ - `ruff check .` passes
+ - `black --check .` passes
+ - Optional: `mypy --strict` passes for new/changed modules
+ - File list is complete
+ - Run DoD checklist (`*dod-checklist`)
+
+ ## Project-Specific Notes
+
+ - **Language:** Python 3.12
+ - **Framework:** Gradio app with custom `mediagallery` component
+ - **Runtime tools:** FFmpeg, moviepy (used for metadata & rendering)
+ - **Test:** `pytest` (unit/integration), Playwright (optional E2E)
+ - **Style:** `black`, `ruff`, `isort`
+ - **Lint:** `ruff`
+ - **Run:** `python app.py` (local dev)
+ - **Env:** HF Spaces tokens via `HF_TOKEN` may be required for some features
+
+ ## Blocking Conditions
+
+ Stop and ask the user when:
+ 1. A new dependency is needed that isn't pre-approved
+ 2. Requirements are ambiguous after checking the task description
+ 3. You've failed 3 times on the same implementation/fix
+ 4. Configuration is missing (env vars, API keys like `HF_TOKEN`)
+ 5. Regression tests fail after your change
.github/agents/orchestrator.agent.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ name: orchestrator
+ description: "Use when: a task should be decomposed into subtasks handled by specialized subagents (dev, qa). Coordinates build, test, and review workflows across agents for this repository."
+ ---
+
+ # Orchestrator — Multi-Agent Coordinator
+
+ ## Persona
+
+ - **Role:** Task decomposition and workflow coordination
+ - **Style:** Concise, systematic, results-oriented
+ - **Focus:** Breaking work into subagent tasks, collecting outputs, and ensuring quality gates pass
+
+ ## Available Subagents
+
+ | Agent | File | Use For |
+ | ----- | ---- | ------- |
+ | **dev** | `.github/agents/dev.agent.md` | Implementing features, fixing bugs, refactoring, running tests |
+ | **local_dev** | `.github/agents/local_dev.agent.md` | Python project implementation (Gradio, mediagallery, FFmpeg, moviepy) |
+ | **code_munch** | `.github/agents/code-munch.agent.md` | Repository indexing via MCP code_munch server |
+ | **qa** | `.github/agents/qa.agent.md` | Code review, test design, QA gate decisions, risk assessment |
+ | **orchestrator** | `.github/agents/orchestrator.agent.md` | High-level task decomposition and workflow coordination |
+
+ ## Commands
+
+ | Command | Description |
+ | ------- | ----------- |
+ | `*help` | Show this command list |
+ | `*plan {goal}` | Decompose goal into numbered subtasks with assigned agents |
+ | `*build` | Run full pipeline: `ruff check .` → `pytest` (local checks) |
+ | `*test` | Run `pytest` and report results |
+ | `*gate {scope}` | Invoke QA agent to produce a gate decision for the scope |
+ | `*status` | Show progress on current plan |
+
+ ## Workflow
+
+ 1. **Decompose** — Break the user's goal into 2–6 discrete subtasks.
+ 2. **Assign** — Pick the best subagent for each subtask (dev, local_dev, code_munch, qa, or orchestrator).
+ 3. **Execute** — Launch each subagent via `runSubagent` with a focused prompt and minimal context.
+ 4. **Validate** — After dev work, invoke QA for review/gate. If the gate is FAIL, re-invoke dev with the findings.
+ 5. **Report** — Merge outputs, present a consolidated result with a changelog.
+
+ ### Build-Test-Review Cycle
+
+ ```
+ Orchestrator
+  ├─► code_munch → index repository (MCP call)
+  ├─► local_dev  → implement task (Python/Gradio project)
+  ├─► local_dev  → run tests (pytest, ruff, black)
+  ├─► qa agent   → review + gate decision
+  │    ├─ PASS → done
+  │    ├─ CONCERNS → log, proceed
+  │    └─ FAIL → local_dev fixes → re-gate (max 2 retries)
+  └─► report results
+ ```
+
+ ## Safety & Constraints
+
+ - Do not use `applyTo: "**"` — invoke explicitly.
+ - Keep subagent prompts small; do not leak secrets.
+ - HALT after 2 failed gate retries and escalate to the user.
+ - Prefer reversible actions; confirm destructive operations with the user.
+
+ ## Supporting Resources
+
+ | Resource | Path |
+ | -------- | ---- |
+ | Gate Output Dir | `.ai/qa/gates/` |
+ | Assessment Output Dir | `.ai/qa/assessments/` |
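The build-test-review cycle above, with its FAIL → fix → re-gate limit, can be sketched as a plain control loop; the callback names are illustrative, not part of the agent spec:

```python
def run_with_gate(implement, review, max_retries: int = 2) -> str:
    """One build-test-review cycle: dev implements, QA gates; a FAIL gate
    re-invokes dev with the findings, up to max_retries retries, then HALT."""
    findings = None
    for attempt in range(max_retries + 1):
        implement(findings)          # dev / local_dev subagent
        gate, findings = review()    # qa subagent -> (decision, findings)
        if gate == "PASS":
            return "done"
        if gate == "CONCERNS":
            return "done (concerns logged)"
        # gate == "FAIL": loop and re-invoke dev with the findings
    return "HALT: escalate to user after failed gate retries"
```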
.github/agents/qa.agent.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ name: qa
+ description: "Use when: reviewing code quality, designing tests, performing QA gate decisions, tracing requirements to tests, or assessing risk for the ai-video-orchestrator Python project."
+ ---
+
+ # Quinn — Test Architect & Quality Advisor
+
+ ## Persona
+
+ - **Role:** Test Architect with Quality Advisory Authority
+ - **Style:** Comprehensive, systematic, advisory, educational, pragmatic
+ - **Focus:** Quality analysis through test architecture, risk assessment, and advisory gates
+
+ ## Core Principles
+
+ - **Depth As Needed** — Go deep based on risk signals, stay concise when low risk.
+ - **Requirements Traceability** — Map acceptance criteria to tests using Given-When-Then.
+ - **Risk-Based Testing** — Prioritize by probability × impact.
+ - **Testability Assessment** — Evaluate controllability, observability, debuggability.
+ - **Gate Governance** — Provide clear PASS / CONCERNS / FAIL decisions with rationale.
+ - **Advisory Excellence** — Educate through documentation, never block arbitrarily.
+ - **Pragmatic Balance** — Distinguish must-fix from nice-to-have improvements.
+
+ ## Commands
+
+ All commands require `*` prefix when invoked (e.g., `*help`).
+
+ | Command | Description |
+ |---------|-------------|
+ | `*help` | Show this command list |
+ | `*gate {scope}` | Write/update a QA gate decision for the given scope |
+ | `*review {scope}` | Adaptive risk-aware review producing a gate decision |
+ | `*test-design {scope}` | Create comprehensive test scenarios (unit/integration/e2e) |
+ | `*trace {scope}` | Map requirements → tests using Given-When-Then |
+ | `*risk-profile {scope}` | Generate risk assessment matrix |
+ | `*run-tests` | Execute `pytest -q` and `ruff check .` |
+ | `*exit` | Leave QA persona |
+
+ ## Gate Decision Criteria
+
+ | Gate | When |
+ |------|------|
+ | **PASS** | All acceptance criteria met, no high-severity issues, tests pass |
+ | **CONCERNS** | Non-blocking issues present; can proceed with awareness |
+ | **FAIL** | Acceptance criteria not met or high-severity issues found |
+
+ ## Severity Scale
+
+ - `low` — Minor / cosmetic
+ - `medium` — Should fix soon, not blocking
+ - `high` — Critical, should block release
+
+ ## Issue ID Prefixes
+
+ `SEC-` Security · `PERF-` Performance · `TEST-` Testing gaps · `MNT-` Maintainability · `ARCH-` Architecture · `DOC-` Documentation · `REQ-` Requirements
+
+ ## Gate File Location
+
+ Gate files are saved to `.ai/qa/gates/{scope-slug}.yml`.
+
+ ### Minimal Gate Schema
+
+ ```yaml
+ schema: 1
+ scope: "{scope}"
+ gate: PASS|CONCERNS|FAIL
+ status_reason: "1-2 sentence explanation"
+ reviewer: "Quinn"
+ updated: "{ISO-8601}"
+ top_issues: []
+ model: "GPT-5 mini"
+ ```
+
+ ## Project-Specific Notes
+
+ - This is a **Python / Gradio** project (ai-video-orchestrator).
+ - Run unit/integration tests with `pytest -q`, lint with `ruff check .`, format checks with `black --check .`.
+ - Type checking (optional) with `mypy` for new modules.
+ - E2E tests (where present) use Playwright; keep tests independent and reproducible.
+ - When reviewing, update only the QA Results section of any story/task file — do not modify other sections without consent.
+
+ ## Workflow
+
+ 1. Read the scope (file, component, story, or PR diff).
+ 2. Analyze against acceptance criteria, coding standards, and security best practices.
+ 3. Design test scenarios at appropriate levels (unit → integration → e2e).
+ 4. Produce gate decision with actionable findings.
+ 5. Run `pytest -q` and `ruff check .` to validate.
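The gate-file convention above (`.ai/qa/gates/{scope-slug}.yml` plus the minimal schema) can be sketched as a small writer. The YAML is emitted by hand to avoid a PyYAML dependency, and `write_gate` is an illustrative name, not an existing project function:

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def write_gate(scope: str, gate: str, reason: str,
               out_dir: str = ".ai/qa/gates") -> Path:
    """Write a minimal gate file following the schema above; the scope is
    slugified (lowercase, non-alphanumerics -> '-') for the filename."""
    assert gate in ("PASS", "CONCERNS", "FAIL")
    slug = re.sub(r"[^a-z0-9]+", "-", scope.lower()).strip("-")
    path = Path(out_dir) / f"{slug}.yml"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        "schema: 1\n"
        f'scope: "{scope}"\n'
        f"gate: {gate}\n"
        f'status_reason: "{reason}"\n'
        'reviewer: "Quinn"\n'
        f'updated: "{datetime.now(timezone.utc).isoformat()}"\n'
        "top_issues: []\n",
        encoding="utf-8",
    )
    return path
```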
.github/copilot-instructions.md ADDED
@@ -0,0 +1,61 @@
+ # Copilot Instructions
+
+ ## General Guidelines
+ - Minimal changes to existing code
+ - Preserve functionality when possible
+ - Clear and concise comments
+ - No plan unless specified
+ - No compile unless specified
+ - No test unless specified
+ - If testing is specified:
+   - Prefer MCP Playwright-based headless browser testing with Chrome, WebKit, Edge, and Firefox
+   - MSTest framework
+   - UV is used
+ - Avoid new dependencies
+ - Use existing functions in the `modules` folder before writing new code. Avoid modifying the existing functions if possible; prefer overload functions.
+
+
+ ## Project-Specific Rules
+ - Gradio reference: https://www.gradio.app/docs/gradio/interface or use the MCP server gradio
+ - Main code is based upon yt_audio_get_tracks.py
+ - The footer should include modules/version_info.py
+ - The HuggingFace dockerfile should be used as a base for the project containerization.
+ - This project is also to be an MCP server, so the code should be structured in a way that allows for easy integration with MCP. (https://huggingface.co/docs/hub/en/agents-mcp)
+ - Download https://github.com/denoland/deno/releases/latest/download/deno-x86_64-pc-windows-msvc.zip and extract deno.exe to the script folder or PATH, per the dockerfile.
+ - Use the provided `AudioGallery` class as a reference for implementing the audio gallery component in the project.
+   Sample: https://huggingface.co/spaces/fffiloni/audio-gallery
+ ```
+
+ class AudioGallery(gr.HTML):
+     def __init__(self, audio_urls, *, value=None, labels=None,
+                  columns=3, label=None, **kwargs):
+         # Note: ${...} below is JavaScript template-literal syntax rendered
+         # client-side from html_template, not Python interpolation.
+         html_template = """
+         <div class="audio-gallery-container">
+           ${label ? `<label>${label}</label>` : ''}
+           <div class="audio-gallery-grid"
+                style="grid-template-columns: repeat(${columns}, 1fr);">
+             ${audio_urls.map((url, i) => `
+               <div class="audio-item" data-index="${i}">
+                 <div class="audio-label">
+                   ${labels && labels[i] ? labels[i] : 'Audio ' + (i+1)}
+                 </div>
+                 <canvas class="waveform-canvas" width="300" height="80"></canvas>
+                 <audio src="${url}" preload="metadata"></audio>
+                 <div class="audio-controls">
+                   <button class="play-btn">▶</button>
+                   <div class="time-display">0:00</div>
+                 </div>
+               </div>
+             `).join('')}
+           </div>
+         </div>
+         """
+         super().__init__(
+             value=value, audio_urls=audio_urls,
+             labels=labels, columns=columns, label=label,
+             html_template=html_template,
+             css_template=CSS_TEMPLATE,
+             js_on_load=JS_ON_LOAD, **kwargs
+         )
+ ```
+
.github/instructions/py.instructions.md ADDED
@@ -0,0 +1,15 @@
+ ---
+ applyTo: "**/*.py"
+ ---
+
+ ## Python and Streamlit Instructions
+
+ - Write clear and concise docstrings for each function and class.
+ - Use snake_case for function names, variable names, and module names.
+ - Use CamelCase for class names.
+ - Follow PEP 8: use 4 spaces for indentation, limit lines to 79 characters.
+ - Add a blank line before and after function definitions.
+ - For Streamlit: prefix components with `st.`, organize UI elements logically, use `st.sidebar` for controls.
+ - Ensure imports are at the top, grouped by standard, third-party (e.g., streamlit), then local.
+ - Never re-run the same failing command in a loop (it burns user tokens); if you hit an error, ask for permission before retrying.
+ - In HTML string variables, use double curly braces for interpolation, e.g., `{{ variable_name }}`, especially in f-strings containing `<script>` tags.
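The double-curly-brace rule above can be seen in a minimal f-string example (the variable names are illustrative): doubled braces survive Python interpolation and reach the browser as single braces, so JS object literals stay intact:

```python
count = 3
html = f"""
<div id="status">Loaded {count} tracks</div>
<script>
  // {{{{ ... }}}} in the f-string source becomes {{ ... }} in the output,
  // so the object literal below is valid JavaScript after interpolation.
  window.appState = {{ ready: true, tracks: {count} }};
</script>
"""
```

`{count}` is interpolated by Python as usual, while `{{` and `}}` are emitted literally as `{` and `}`.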
.github/prompts/document.md ADDED
@@ -0,0 +1,15 @@
+ Update readme.md, claude.md, specs/specs.mdx, specs/requirements.mdx, specs/leaderboard_spec.mdx, battlewords/__init__.py, pyproject.toml, and #gameplay_guide.md for these latest changes, as needed. Make changes minimal.
+ Include any new features, bug fixes, or important updates in the documentation.
+ Ensure **Current Version:** is up to date in all relevant files.
+ Ensure **Last Updated:** is current.
+
+ - Update documentation to reflect UI changes in battlewords/ui.py, including:
+   - Leaderboard navigation is now in the footer menu, not the sidebar.
+   - The game-over dialog integrates leaderboard submission and displays qualification results.
+   - Leaderboard page routing uses query parameters and custom navigation links.
+   - Footer navigation links to the Leaderboard, Play, and Settings pages.
+ - Make all documentation changes minimal and focused on these UI updates.
+
+ Additionally, in #readme.md:
+ - Update Recent Changes to reflect the new UI changes.
+ - Update the Known Issues section to reflect any related bug fixes or improvements.
.gitignore ADDED
@@ -0,0 +1,17 @@
+ ################################################################################
+ # This .gitignore file was automatically created by Microsoft(R) Visual Studio.
+ ################################################################################
+
+ /.vs
+ .env
+ # Commonly ignored items (adjust as needed)
+ node_modules/
+ .pip/
+ venv/
+ __pycache__/
+ *.bat
+ *.ps1
+ *.bak
+ separated/htdemucs/
+ separated/htdemucs_6s/
+ *.webm
CLAUDE.md ADDED
@@ -0,0 +1,100 @@
+ # CLAUDE.md — SeparateTracks Project Context
+
+ ## Project Overview
+ **SeparateTracks** (`Surn/SeparateTracks`) — A HuggingFace Docker Space that:
+ - Downloads audio from YouTube via `yt-dlp` + Deno
+ - Separates audio into 6 instrument stems using Demucs (`htdemucs_6s`)
+ - Presents results in a Gradio UI with a custom `AudioGallery` HTML component
+ - Exposes an MCP server at `/gradio_api/mcp/sse`
+
+ ## Key Files
+
+ | File | Purpose |
+ |------|---------|
+ | `app.py` | **Missing** — main Gradio entry point to create |
+ | `yt_audio_get_tracks.py` | Core logic: `download_audio()` + `separate_tracks()` |
+ | `modules/constants.py` | Env vars (`HF_TOKEN`, `HF_REPO_ID`, etc.), shared constants |
+ | `modules/version_info.py` | `versions_html()` for Gradio footer |
+ | `modules/file_utils.py` | File utility helpers |
+ | `requirements.txt` | Pip dependencies (needs gradio, dotenv, numpy, Pillow) |
+ | `dockerfile` | Docker image (needs ffmpeg apt + full pip install) |
+ | `specs/build.md` | Step-by-step build plan |
+
+ ## Architecture
+ ```
+ app.py (Gradio Blocks + mcp_server=True)
+  ├── AudioGallery (custom gr.HTML subclass — 7-stem audio grid)
+  ├── yt_audio_get_tracks.download_audio() → separated/{id}.wav
+  ├── yt_audio_get_tracks.separate_tracks() → separated/htdemucs_6s/{id}/*.mp3
+  └── modules/version_info.versions_html() → footer HTML
+ ```
+
+ ## Copilot / Agent Rules (from `.github/copilot-instructions.md`)
+ - **Minimal changes** — preserve existing functionality
+ - **No new dependencies** without approval
+ - **Use existing `modules/` functions** before writing new code; prefer overloads
+ - **Gradio reference**: https://www.gradio.app/docs/gradio/interface
+ - **AudioGallery** — extend `gr.HTML`; reference `fffiloni/audio-gallery` on HF
+ - **Footer** must use `modules/version_info.versions_html()`
+ - **Dockerfile** is HuggingFace-compatible (base: `python:3.12-slim`)
+ - **MCP** — expose via Gradio's built-in `mcp_server=True` + `launch()`
+ - **Deno** — install from `deno.land/install.sh` (docker) or add exe to PATH (local)
+ - **Testing** — Playwright MCP headless (Chrome/WebKit/Edge/Firefox), MSTest, UV
+
+ ## Python Style (from `.github/instructions/py.instructions.md`)
+ - snake_case functions/variables, CamelCase classes
+ - PEP 8: 4-space indent, 79-char lines
+ - Imports: stdlib → third-party → local
+ - **In f-strings with `<script>` tags: use `{{ }}` for JS template literals**
+ - Tools: `black`, `ruff`, `isort`, `mypy` (optional)
+
+ ## Environment Variables (`.env`)
+ | Variable | Purpose |
+ |----------|---------|
+ | `HF_TOKEN` | HuggingFace API token |
+ | `CRYPTO_PK` | Crypto private key |
+ | `HF_REPO_ID` | HF storage repo (`Surn/Storage`) |
+ | `SPACE_NAME` | HF Space ID (`Surn/SeparateTracks`) |
+ | `TMPDIR` | Temp directory for processing |
+ | `IS_LOCAL` | `true` when running locally |
+
+ > `.env` is NOT committed to git. Add `.env` to `.gitignore` if not already present.
+
+ ## Stems Produced by Demucs `htdemucs_6s`
+ - `drums.mp3`, `vocals.mp3`, `guitar.mp3`, `bass.mp3`, `piano.mp3`, `other.mp3`
+ - `music.mp3` — synthesized as `bass + other` overlay (per existing code)
+ - Output path: `separated/htdemucs_6s/{video_id}/`
+
+ ## What's Missing / TODO
+ See `specs/build.md` for the complete checklist. Summary:
+ 1. Add `.env` to `.gitignore`
+ 2. Complete `requirements.txt` (add `gradio[mcp]`, `python-dotenv`, `numpy`, `Pillow`, `requests`)
+ 3. Fix `dockerfile` (add `ffmpeg` apt, install requirements.txt)
+ 4. **Create `app.py`** — Gradio Blocks with AudioGallery and MCP server
+ 5. Verify `modules/constants.py` doesn't crash locally (HF_TOKEN in .env handles this)
+
+ ## Local Dev Commands
+ ```bash
+ pip install -r requirements.txt
+ python app.py  # starts on http://localhost:7860
+ ```
+
+ ## Docker Commands
+ ```bash
+ docker build -t separatetracks .
+ docker run -p 7860:7860 --env-file .env separatetracks
+ ```
+
+ ## Agent Personas (`.github/agents/`)
+ | Agent | Role |
+ |-------|------|
+ | `orchestrator` | Decomposes tasks → assigns to dev/qa |
+ | `dev` / `local_dev` | Implements features (Python 3.12, Gradio) |
+ | `qa` | Reviews, gates, risk assessment |
+ | `code-munch` | Repository indexing via MCP |
+ | `file-discovery` | Locates files across repo |
+
+ ## Security Notes
+ - `.env` contains sensitive credentials — never commit
+ - `constants.py` validates `HF_TOKEN` at import time; ensure `.env` is loaded first
+ - Rotate `HF_TOKEN` and `CRYPTO_PK` if they were ever exposed
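The `music.mp3` bullet above describes a `bass + other` overlay. A minimal sketch of that mix over raw sample arrays could look like the following; the actual repository code may do this via moviepy/ffmpeg instead, and `overlay_stems` is an illustrative name:

```python
import numpy as np


def overlay_stems(bass: np.ndarray, other: np.ndarray) -> np.ndarray:
    """Mix two same-sample-rate stems sample-by-sample: pad the shorter
    stem with silence and clip the sum to the [-1, 1] float-audio range."""
    n = max(len(bass), len(other))
    mix = np.zeros(n, dtype=np.float64)
    mix[: len(bass)] += bass
    mix[: len(other)] += other
    return np.clip(mix, -1.0, 1.0)
```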
README.md CHANGED
@@ -1,11 +1,22 @@
 ---
 title: SeparateTracks
-emoji: 🦀
-colorFrom: purple
-colorTo: indigo
+emoji: 🎼
+colorFrom: red
+colorTo: yellow
 sdk: docker
+sdk_version: 6.13.0
+app_file: app.py
+tags:
+- audio
+- music
+- tools
+- MCP
 pinned: false
-short_description: Take
+short_description: Separate tracks from a mixed audio
 ---
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+# Track Separate
+
+
app.py ADDED
@@ -0,0 +1,274 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ # app.py β€” SeparateTracks Gradio application
+ # Entry point: python app.py (runs on http://localhost:7860)
+ # MCP endpoint: http://localhost:7860/gradio_api/mcp/sse
+ import os
+ import sys
+
+ import gradio as gr
+
+ from yt_audio_get_tracks import download_audio, separate_tracks
+
+
+ # ---------------------------------------------------------------------------
+ # AudioGallery CSS β€” injected inline so the component is self-contained
+ # ---------------------------------------------------------------------------
+ _CSS = """
+ .audio-gallery-container {
+     padding: 16px;
+ }
+ .audio-gallery-grid {
+     display: grid;
+     gap: 16px;
+ }
+ .audio-item {
+     background: var(--block-background-fill, #1e1e2e);
+     border: 1px solid var(--block-border-color, #3a3a5c);
+     border-radius: 8px;
+     padding: 12px;
+     display: flex;
+     flex-direction: column;
+     gap: 8px;
+ }
+ .audio-label {
+     font-weight: 600;
+     font-size: 0.9rem;
+     color: var(--body-text-color, #cdd6f4);
+     text-transform: uppercase;
+     letter-spacing: 0.05em;
+ }
+ .waveform-canvas {
+     width: 100%;
+     height: 60px;
+     border-radius: 4px;
+     background: var(--background-fill-secondary, #181825);
+     display: block;
+ }
+ .audio-controls {
+     display: flex;
+     align-items: center;
+     gap: 8px;
+ }
+ .play-btn {
+     background: #4a9eff;
+     border: none;
+     border-radius: 50%;
+     width: 32px;
+     height: 32px;
+     cursor: pointer;
+     font-size: 0.85rem;
+     color: white;
+     flex-shrink: 0;
+ }
+ .play-btn:hover {
+     background: #6ab4ff;
+ }
+ .time-display {
+     font-size: 0.8rem;
+     color: var(--body-text-color, #a6adc8);
+     font-family: monospace;
+ }
+ """
+
+ # ---------------------------------------------------------------------------
+ # AudioGallery JS β€” initialises waveform canvas + play/pause for each item.
+ # Uses a self-invoking function; data-initialized guard prevents double-bind
+ # when Gradio re-renders the component.
+ # Note: curly braces inside this plain string are NOT Python format braces.
+ # ---------------------------------------------------------------------------
+ _JS = """
+ (function () {
+     function formatTime(secs) {
+         var m = Math.floor(secs / 60);
+         var s = Math.floor(secs % 60).toString().padStart(2, '0');
+         return m + ':' + s;
+     }
+
+     function drawWaveform(canvas) {
+         var ctx = canvas.getContext('2d');
+         var w = canvas.offsetWidth || 300;
+         canvas.width = w;
+         var h = canvas.height;
+         ctx.clearRect(0, 0, w, h);
+         ctx.fillStyle = '#4a9eff';
+         var bars = 60;
+         for (var i = 0; i < bars; i++) {
+             var x = (i / bars) * w;
+             var bw = Math.max(1, w / bars - 2);
+             var amp = h * (0.2 + 0.7 * Math.abs(Math.sin(i * 0.45 + Math.random() * 0.3)));
+             var y = (h - amp) / 2;
+             ctx.fillRect(x, y, bw, amp);
+         }
+     }
+
+     function initItems() {
+         document.querySelectorAll('.audio-item[data-initialized="false"]').forEach(function (item) {
+             item.setAttribute('data-initialized', 'true');
+             var audio = item.querySelector('audio');
+             var canvas = item.querySelector('.waveform-canvas');
+             var btn = item.querySelector('.play-btn');
+             var timeDisplay = item.querySelector('.time-display');
+
+             drawWaveform(canvas);
+
+             btn.addEventListener('click', function () {
+                 // Pause any other playing tracks
+                 document.querySelectorAll('.audio-item audio').forEach(function (a) {
+                     if (a !== audio && !a.paused) {
+                         a.pause();
+                         a.closest('.audio-item').querySelector('.play-btn').textContent = '\u25B6';
+                     }
+                 });
+                 if (audio.paused) {
+                     audio.play();
+                     btn.textContent = '\u23F8';
+                 } else {
+                     audio.pause();
+                     btn.textContent = '\u25B6';
+                 }
+             });
+
+             audio.addEventListener('timeupdate', function () {
+                 timeDisplay.textContent = formatTime(audio.currentTime);
+             });
+
+             audio.addEventListener('ended', function () {
+                 btn.textContent = '\u25B6';
+             });
+         });
+     }
+
+     // Defer to ensure canvas dimensions are resolved after layout
+     setTimeout(initItems, 50);
+ })();
+ """
+
+
+ # ---------------------------------------------------------------------------
+ # AudioGallery component
+ # ---------------------------------------------------------------------------
+ class AudioGallery(gr.HTML):
+     """Gradio HTML component that renders audio stems in a responsive grid.
+
+     Extends gr.HTML; builds a self-contained HTML snippet with inline CSS
+     and JS for waveform visualisation and play/pause controls.
+     """
+
+     DEFAULT_LABELS = ["Drums", "Vocals", "Guitar", "Bass", "Other", "Piano", "Music"]
+
+     def __init__(
+         self,
+         audio_urls,
+         *,
+         value=None,
+         labels=None,
+         columns=3,
+         label=None,
+         **kwargs,
+     ):
+         labels = labels or self.DEFAULT_LABELS
+         html = self._build_html(audio_urls, labels=labels, columns=columns)
+         super().__init__(value=html, label=label, **kwargs)
+
+     @staticmethod
+     def _build_html(audio_urls, labels, columns):
+         items = ""
+         for i, url in enumerate(audio_urls):
+             lbl = labels[i] if i < len(labels) else f"Track {i + 1}"
+             items += (
+                 f'<div class="audio-item" data-index="{i}" data-initialized="false">'
+                 f'<div class="audio-label">{lbl}</div>'
+                 f'<canvas class="waveform-canvas" width="300" height="60"></canvas>'
+                 f'<audio src="{url}" preload="metadata"></audio>'
+                 f'<div class="audio-controls">'
+                 f'<button class="play-btn">&#9654;</button>'
+                 f'<div class="time-display">0:00</div>'
+                 f'</div>'
+                 f'</div>\n'
+             )
+         return (
+             f'<style>{_CSS}</style>'
+             f'<div class="audio-gallery-container">'
+             f'<div class="audio-gallery-grid" style="grid-template-columns: repeat({columns}, 1fr);">'
+             f'{items}'
+             f'</div>'
+             f'</div>'
+             f'<script>{_JS}</script>'
+         )
+
+
+ # ---------------------------------------------------------------------------
+ # Version footer (graceful fallback if torch/cuda not available)
+ # ---------------------------------------------------------------------------
+ def _footer_html():
+     try:
+         from modules.version_info import versions_html
+         return versions_html()
+     except Exception:
+         python_ver = ".".join(str(x) for x in sys.version_info[:3])
+         return f"python: {python_ver} &bull; gradio: {gr.__version__}"
+
+
+ # ---------------------------------------------------------------------------
+ # Core processing function (also exposed as MCP tool)
+ # ---------------------------------------------------------------------------
+ def process_video(video_id: str) -> str:
+     """Download audio from a YouTube video and separate it into instrument stems.
+
+     Uses Demucs htdemucs_6s to produce drums, vocals, guitar, bass, piano,
+     other, and a combined music track. Results are displayed as an audio gallery.
+
+     Args:
+         video_id: YouTube video ID (e.g. dQw4w9WgXcQ).
+
+     Returns:
+         HTML string containing the AudioGallery with all separated stems.
+     """
+     video_id = video_id.strip()
+     if not video_id:
+         return "<p style='color:red;'>Please enter a YouTube video ID.</p>"
+
+     try:
+         url = f"https://www.youtube.com/watch?v={video_id}"
+         wav = download_audio(url, video_id)
+         drums, vocals, guitar, bass, other, piano, music = separate_tracks(wav, video_id)
+     except Exception as exc:
+         return f"<p style='color:red;'>Error: {exc}</p>"
+
+     paths = [drums, vocals, guitar, bass, other, piano, music]
+     audio_urls = [f"/file={os.path.abspath(p)}" for p in paths]
+     return AudioGallery(audio_urls=audio_urls, columns=3).value
+
+
+ # ---------------------------------------------------------------------------
+ # Gradio UI
+ # ---------------------------------------------------------------------------
+ with gr.Blocks(title="SeparateTracks") as demo:
+     gr.Markdown(
+         "## \U0001f3bc SeparateTracks\n"
+         "Enter a YouTube video ID to separate the audio into instrument stems "
+         "using [Demucs htdemucs\\_6s](https://github.com/adefossez/demucs)."
+     )
+
+     with gr.Row():
+         video_id_input = gr.Textbox(
+             label="YouTube Video ID",
+             placeholder="dQw4w9WgXcQ",
+             scale=4,
+         )
+         run_btn = gr.Button("Separate Tracks", variant="primary", scale=1)
+
+     audio_output = gr.HTML(label="Separated Tracks")
+     gr.HTML(value=_footer_html())
+
+     run_btn.click(
+         fn=process_video,
+         inputs=video_id_input,
+         outputs=audio_output,
+     )
+
+ if __name__ == "__main__":
+     demo.launch(
+         mcp_server=True,
+         server_name="0.0.0.0",
+         server_port=7860,
+     )
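A quick standalone check of the gallery markup (re-deriving `_build_html`'s per-item structure without the gradio dependency; the sample URL and label are illustrative only):

```python
# Standalone re-derivation of AudioGallery._build_html's per-item markup,
# without importing gradio -- for a quick structural check only.
def build_items(audio_urls, labels):
    items = ""
    for i, url in enumerate(audio_urls):
        # Fall back to a generic label when fewer labels than tracks
        lbl = labels[i] if i < len(labels) else f"Track {i + 1}"
        items += (
            f'<div class="audio-item" data-index="{i}" data-initialized="false">'
            f'<div class="audio-label">{lbl}</div>'
            f'<audio src="{url}" preload="metadata"></audio>'
            f'</div>'
        )
    return items

html = build_items(["/file=/tmp/drums.wav"], ["Drums"])
print(html)
```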
dockerfile ADDED
@@ -0,0 +1,23 @@
+ FROM python:3.12-slim
+
+ # System deps: ffmpeg for audio processing, git for version_info, Deno for yt-dlp JS extractor
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     ffmpeg curl unzip git \
+     && curl -fsSL https://deno.land/install.sh | sh \
+     && cp /root/.deno/bin/deno /usr/local/bin/ \
+     && rm -rf /var/lib/apt/lists/*
+
+ WORKDIR /app
+
+ # Copy requirements first for better layer caching
+ COPY requirements.txt .
+
+ # Install torch first (demucs dependency), then gradio, then everything else
+ RUN pip install --no-cache-dir torch torchaudio torchvision
+ RUN pip install --no-cache-dir "gradio[mcp]" transformers
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 7860
+ CMD ["python", "app.py"]
modules/constants.py ADDED
@@ -0,0 +1,84 @@
+ # modules/constants.py
+ # constants.py contains all the constants used in the project such as the default LUT example image, prompts, negative prompts, pre-rendered maps, models, LoRA weights, and more.
+ # exceptions made for some environment variables
+ import os
+ from pathlib import Path
+ from dotenv import load_dotenv
+ import numpy as np
+
+
+ IS_SHARED_SPACE = "Surn/SeparateTracks" in os.environ.get('SPACE_ID', '')
+
+ # Load environment variables from .env file
+ dotenv_path = Path(__file__).parent.parent / '.env'
+ load_dotenv(dotenv_path)
+
+ # Function to load env vars from .env and create Python variables
+ def load_env_vars(env_path):
+     try:
+         with open(env_path, 'r') as file:
+             for line in file:
+                 # Skip empty lines or comments
+                 line = line.strip()
+                 if line and not line.startswith('#'):
+                     # Split on the first '=' only
+                     if '=' in line:
+                         key, value = line.split('=', 1)
+                         key = key.strip()
+                         value = value.strip()
+                         # Dynamically create a Python variable with the key name
+                         globals()[key] = value
+                         # Also update os.environ (optional, for consistency)
+                         os.environ[key] = value
+     except FileNotFoundError:
+         print(f"Warning: .env file not found at {env_path}")
+
+
+ USE_FLASH_ATTENTION = os.getenv("USE_FLASH_ATTENTION", "0") == "1"
+ HF_API_TOKEN = os.getenv("HF_TOKEN", None)
+ CRYPTO_PK = os.getenv("CRYPTO_PK", None)
+ if not HF_API_TOKEN:
+     raise ValueError("HF_TOKEN is not set. Please check your .env file.")
+
+ default_lut_example_img = "./LUT/daisy.jpg"
+ MAX_SEED = np.iinfo(np.int32).max
+ TARGET_SIZE = (2688, 1536)
+ BASE_HEIGHT = 640
+ SCALE_FACTOR = (12 / 5)
+ try:
+     if os.environ['TMPDIR']:
+         TMPDIR = os.environ['TMPDIR']
+     else:
+         TMPDIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'tmp')
+ except KeyError:
+     TMPDIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'tmp')
+
+ os.makedirs(TMPDIR, exist_ok=True)
+
+ SPACE_NAME = os.getenv('SPACE_NAME', 'Surn/SeparateTracks')
+
+ # Constants for URL shortener and storage
+ HF_REPO_ID = os.getenv("HF_REPO_ID", "Surn/Storage")  # Replace with your Hugging Face repository ID
+
+ SHORTENER_JSON_FILE = "shortener.json"
+
+ model_extensions = {".glb", ".gltf", ".obj", ".ply"}
+ model_extensions_list = list(model_extensions)
+ image_extensions = {".png", ".jpg", ".jpeg", ".webp"}
+ image_extensions_list = list(image_extensions)
+ audio_extensions = {".mp3", ".wav", ".ogg", ".flac"}
+ audio_extensions_list = list(audio_extensions)
+ video_extensions = {".mp4"}
+ video_extensions_list = list(video_extensions)
+ doc_extensions = {".json"}
+ doc_extensions_list = list(doc_extensions)
+ upload_file_types = model_extensions_list + image_extensions_list + audio_extensions_list + video_extensions_list + doc_extensions_list
+
+ umg_mcp_server = "https://surn-unlimitedmusicgen.hf.space/gradio_api/mcp/sse"
+ # umg_mcp_server = "http://127.0.0.1:7860/gradio_api/mcp/sse"
+ badge_negative_prompt = "low quality, blurry, copyright, cropped, worst quality, bad text, missing text, normal quality, jpeg artifacts, signature, watermark, username, missing_transparent_background"
+ default_badge = "assets/openbadge.png"
+ default_badge_512_url = os.getenv("DEFAULT_BADGE_512_URL", None)
+
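The `.env` parsing in `load_env_vars` above splits on the first `=` only, so values containing `=` survive intact. A standalone sketch of that rule (no file I/O; the sample lines are illustrative):

```python
# Standalone sketch of load_env_vars' line-parsing rule:
# skip blanks and comments, split on the FIRST '=' only.
def parse_env_lines(lines):
    result = {}
    for line in lines:
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            key, value = line.split('=', 1)   # first '=' only
            result[key.strip()] = value.strip()
    return result

parsed = parse_env_lines(["# comment", "", "HF_TOKEN = abc", "URL=https://x/y?a=1"])
print(parsed)
```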
modules/file_utils.py ADDED
@@ -0,0 +1,204 @@
+ # file_utils
+ import os
+ import shutil
+ from pathlib import Path
+ import requests
+ from PIL import Image
+ from io import BytesIO
+ from urllib.parse import urlparse
+
+ def get_file_parts(file_path: str):
+     # Split the path into directory and filename
+     directory, filename = os.path.split(file_path)
+
+     # Split the filename into name and extension
+     name, ext = os.path.splitext(filename)
+
+     # Convert the extension to lowercase
+     new_ext = ext.lower()
+     return directory, filename, name, ext, new_ext
+
+ def rename_file_to_lowercase_extension(file_path: str) -> str:
+     """
+     Renames a file's extension to lowercase in place.
+
+     Parameters:
+         file_path (str): The original file path.
+
+     Returns:
+         str: The new file path with the lowercase extension.
+
+     Raises:
+         OSError: If there is an error renaming the file (e.g., file not found, permissions issue).
+     """
+     directory, filename, name, ext, new_ext = get_file_parts(file_path)
+     # If the extension changes, rename the file
+     if ext != new_ext:
+         new_filename = name + new_ext
+         new_file_path = os.path.join(directory, new_filename)
+         try:
+             os.rename(file_path, new_file_path)
+             print(f"Renamed {file_path} to {new_file_path}\n")
+         except Exception as e:
+             print(f"os.rename failed: {e}. Falling back to binary copy operation.")
+             try:
+                 # Read the file in binary mode and write it to new_file_path
+                 with open(file_path, 'rb') as f:
+                     data = f.read()
+                 with open(new_file_path, 'wb') as f:
+                     f.write(data)
+                 print(f"Copied {file_path} to {new_file_path}\n")
+                 # Optionally, remove the original file after copying
+                 # os.remove(file_path)
+             except Exception as inner_e:
+                 print(f"Failed to copy file from {file_path} to {new_file_path}: {inner_e}")
+                 raise inner_e
+         return new_file_path
+     else:
+         return file_path
+
+ def get_filename(file):
+     # extract filename from file object
+     filename = None
+     if file is not None:
+         filename = file.name
+     return filename
+
+ def convert_title_to_filename(title):
+     # convert title to filename
+     filename = title.lower().replace(" ", "_").replace("/", "_")
+     return filename
+
+ def get_filename_from_filepath(filepath):
+     file_name = os.path.basename(filepath)
+     file_base, file_extension = os.path.splitext(file_name)
+     return file_base, file_extension
+
+ def delete_file(file_path: str) -> None:
+     """
+     Deletes the specified file.
+
+     Parameters:
+         file_path (str): The path to the file to delete.
+
+     Raises:
+         FileNotFoundError: If the file does not exist.
+         Exception: If there is an error deleting the file.
+     """
+     try:
+         path = Path(file_path)
+         path.unlink()
+         print(f"Deleted original file: {file_path}")
+     except FileNotFoundError:
+         print(f"File not found: {file_path}")
+     except Exception as e:
+         print(f"Error deleting file: {e}")
+
+ def get_unique_file_path(directory, filename, file_ext, counter=0):
+     """
+     Recursively increments the filename until a unique path is found.
+
+     Parameters:
+         directory (str): The directory for the file.
+         filename (str): The base filename.
+         file_ext (str): The file extension including the leading dot.
+         counter (int): The current counter value to append.
+
+     Returns:
+         str: A unique file path that does not exist.
+     """
+     if counter == 0:
+         filepath = os.path.join(directory, f"{filename}{file_ext}")
+     else:
+         filepath = os.path.join(directory, f"{filename}{counter}{file_ext}")
+
+     if not os.path.exists(filepath):
+         return filepath
+     else:
+         return get_unique_file_path(directory, filename, file_ext, counter + 1)
+
+ # Example usage:
+ # new_file_path = get_unique_file_path(video_dir, title_file_name, video_new_ext)
+
+ def download_and_save_image(url: str, dst_folder: Path, token: str = None) -> Path:
+     """
+     Downloads an image from a URL with authentication if a token is provided,
+     verifies it with PIL, and saves it in dst_folder with a unique filename.
+
+     Args:
+         url (str): The image URL.
+         dst_folder (Path): The destination folder for the image.
+         token (str, optional): A valid Bearer token. If not provided, the HF_API_TOKEN
+             environment variable is used if available.
+
+     Returns:
+         Path: The saved image's file path.
+     """
+     headers = {}
+     # Use provided token; otherwise, fall back to environment variable.
+     api_token = token or os.getenv("HF_API_TOKEN")
+     if api_token:
+         headers["Authorization"] = f"Bearer {api_token}"
+
+     response = requests.get(url, headers=headers)
+     response.raise_for_status()
+     pil_image = Image.open(BytesIO(response.content))
+
+     parsed_url = urlparse(url)
+     original_filename = os.path.basename(parsed_url.path)  # e.g., "background.png"
+     base, ext = os.path.splitext(original_filename)
+
+     # Use get_unique_file_path to generate a unique file path.
+     unique_filepath_str = get_unique_file_path(str(dst_folder), base, ext)
+     dst = Path(unique_filepath_str)
+     dst_folder.mkdir(parents=True, exist_ok=True)
+     pil_image.save(dst)
+     return dst
+
+ def download_and_save_file(url: str, dst_folder: Path, token: str = None) -> Path:
+     """
+     Downloads a binary file (e.g., audio or video) from a URL with authentication if a token is provided,
+     and saves it in dst_folder with a unique filename.
+
+     Args:
+         url (str): The file URL.
+         dst_folder (Path): The destination folder for the file.
+         token (str, optional): A valid Bearer token.
+
+     Returns:
+         Path: The saved file's path.
+     """
+     headers = {}
+     if token:
+         headers["Authorization"] = f"Bearer {token}"
+
+     response = requests.get(url, headers=headers)
+     response.raise_for_status()
+
+     parsed_url = urlparse(url)
+     original_filename = os.path.basename(parsed_url.path)
+     base, ext = os.path.splitext(original_filename)
+
+     unique_filepath_str = get_unique_file_path(str(dst_folder), base, ext)
+     dst = Path(unique_filepath_str)
+     dst_folder.mkdir(parents=True, exist_ok=True)
+
+     with open(dst, "wb") as f:
+         f.write(response.content)
+
+     return dst
+
+
+ if __name__ == "__main__":
+     # Example usage
+     url = "https://example.com/image.png"
+     dst_folder = Path("downloads")
+     download_and_save_image(url, dst_folder)
+     # Example usage for file download
+     file_url = "https://example.com/file.mp3"
+     downloaded_file = download_and_save_file(file_url, dst_folder)
+     print(f"File downloaded to: {downloaded_file}")
+     # Example usage for renaming file extension
+     file_path = "example.TXT"
+     new_file_path = rename_file_to_lowercase_extension(file_path)
+     print(f"Renamed file to: {new_file_path}")
modules/version_info.py ADDED
@@ -0,0 +1,120 @@
+ # version_info.py
+
+ import subprocess
+ import os
+ import sys
+ import gc
+ import gradio as gr
+
+ git = os.environ.get('GIT', "git")
+
+ def commit_hash():
+     try:
+         return subprocess.check_output([git, "rev-parse", "HEAD"], shell=False, encoding='utf8').strip()
+     except Exception:
+         return "<none>"
+
+ def get_xformers_version():
+     try:
+         import xformers
+         return xformers.__version__
+     except Exception:
+         return "<none>"
+
+ def get_transformers_version():
+     try:
+         import transformers
+         return transformers.__version__
+     except Exception:
+         return "<none>"
+
+ def get_accelerate_version():
+     try:
+         import accelerate
+         return accelerate.__version__
+     except Exception:
+         return "<none>"
+
+ def get_safetensors_version():
+     try:
+         import safetensors
+         return safetensors.__version__
+     except Exception:
+         return "<none>"
+
+ def get_diffusers_version():
+     try:
+         import diffusers
+         return diffusers.__version__
+     except Exception:
+         return "<none>"
+
+ def get_torch_info():
+     from torch import __version__ as torch_version_, version, cuda, backends
+     device_type = initialize_cuda()
+     if device_type == "cuda":
+         try:
+             info = [torch_version_, f"CUDA Version:{version.cuda}", f"Available:{cuda.is_available()}", f"flash attention enabled: {backends.cuda.flash_sdp_enabled()}", f"Capabilities: {cuda.get_device_capability(0)}", f"Device Name: {cuda.get_device_name(0)}", f"Device Count: {cuda.device_count()}"]
+             return info
+         except Exception:
+             return "<none>"
+     else:
+         return "Not Recognized"
+
+ def release_torch_resources():
+     from torch import cuda
+     # Clear the CUDA cache
+     cuda.empty_cache()
+     cuda.ipc_collect()
+     # Delete any objects that are using GPU memory
+     # for obj in gc.get_objects():
+     #     if is_tensor(obj) or (hasattr(obj, 'data') and is_tensor(obj.data)):
+     #         del obj
+     # Run garbage collection
+     gc.collect()
+
+
+ def initialize_cuda():
+     from torch import cuda, version
+     if cuda.is_available():
+         print(f"CUDA is available. Using device: {cuda.get_device_name(0)} with CUDA version: {version.cuda}")
+         result = "cuda"
+     else:
+         print("CUDA is not available. Using CPU.")
+         result = "cpu"
+     return result
+
+ def versions_html():
+     from torch import __version__ as torch_version_
+     python_version = ".".join([str(x) for x in sys.version_info[0:3]])
+     commit = commit_hash()
+
+     # Define the Toggle Dark Mode link with JavaScript
+     toggle_dark_link = '''
+     <a href="#" onclick="document.body.classList.toggle('dark'); return false;" style="cursor: pointer; text-decoration: underline;">
+         Toggle Dark Mode
+     </a>
+     '''
+
+     v_html = f"""
+     version: <a href="https://huggingface.co/spaces/Surn/DPTDepth3D/commit/{"huggingface" if commit == "<none>" else commit}" target="_blank">{"huggingface" if commit == "<none>" else commit}</a>
+     &#x2000;β€’&#x2000;
+     python: <span title="{sys.version}">{python_version}</span>
+     &#x2000;β€’&#x2000;
+     torch: {torch_version_}
+     &#x2000;β€’&#x2000;
+     xformers: {get_xformers_version()}
+     &#x2000;β€’&#x2000;
+     transformers: {get_transformers_version()}
+     &#x2000;β€’&#x2000;
+     safetensors: {get_safetensors_version()}
+     &#x2000;β€’&#x2000;
+     gradio: {gr.__version__}
+     &#x2000;β€’&#x2000;
+     {toggle_dark_link}
+     <br>
+     Full GPU Info:{get_torch_info()}
+     """
+     return v_html
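The per-package `get_*_version` helpers above all follow the same probe-or-`<none>` pattern; it could be generalized with a single helper (a sketch using the stdlib `importlib.metadata`, not part of this commit):

```python
# Generic version probe: returns "<none>" when a distribution is absent,
# mirroring the get_*_version helpers above.
from importlib.metadata import version, PackageNotFoundError

def get_version(package: str) -> str:
    try:
        return version(package)
    except PackageNotFoundError:
        return "<none>"

print(get_version("definitely-not-a-real-package"))  # β†’ <none>
```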
requirements.txt ADDED
@@ -0,0 +1,15 @@
+ # core audio pipeline
+ yt-dlp
+ demucs
+ pydub
+ youtube-transcript-api
+ youtube-channel-transcript-api
+
+ # gradio UI + MCP server
+ gradio[mcp]>=5.0
+
+ # utility deps used by modules/
+ python-dotenv
+ numpy
+ Pillow
+ requests
specs/build.md ADDED
@@ -0,0 +1,296 @@
1
+ # SeparateTracks β€” Build Plan
2
+
3
+ ## Goal
4
+ Produce a running Gradio application (`app.py`) that downloads audio from YouTube,
5
+ separates it into instrument stems via Demucs, displays results in an `AudioGallery`
6
+ UI, and exposes an MCP server endpoint β€” deployable locally and as a HuggingFace
7
+ Docker Space (`Surn/SeparateTracks`).
8
+
9
+ ---
10
+
11
+ ## Project Map
12
+
13
+ | File | Status | Purpose |
14
+ |------|--------|---------|
15
+ | `yt_audio_get_tracks.py` | exists | Core logic: download + separate |
16
+ | `app.py` | **MISSING** | Gradio UI entry point |
17
+ | `modules/constants.py` | exists | Env vars, shared constants |
18
+ | `modules/version_info.py` | exists | Footer HTML with versions |
19
+ | `modules/file_utils.py` | exists | File helper utilities |
20
+ | `requirements.txt` | incomplete | Missing gradio, ffmpeg-python, Pillow, python-dotenv, numpy |
21
+ | `dockerfile` | incomplete | Missing apt ffmpeg, requirements.txt install |
22
+ | `.gitignore` | incomplete | Missing `.env` entry |
23
+
24
+ ---
25
+
26
+ ## Step 1 β€” Fix `.gitignore`
27
+
28
+ **Problem:** `.env` contains real credentials (`HF_TOKEN`, `CRYPTO_PK`) and is not
29
+ excluded from git tracking.
30
+
31
+ **Action:** Add `.env` to `.gitignore`.
32
+
33
+ ```
34
+ # add to .gitignore
35
+ .env
36
+ separated/
37
+ *.webm
38
+ ```
39
+
40
+ > **WARNING:** Rotate/regenerate the `HF_TOKEN` and `CRYPTO_PK` values in `.env`
41
+ > if they have ever been committed to git or shared publicly.
42
+
43
+ ---
44
+
45
+ ## Step 2 β€” Fix `requirements.txt`
46
+
47
+ Current file is missing packages that `modules/` and the planned `app.py` need.
48
+
49
+ ```txt
50
+ # core audio pipeline
51
+ yt-dlp
52
+ demucs
53
+ pydub
54
+ youtube-transcript-api
55
+ youtube-channel-transcript-api
56
+
57
+ # gradio UI + MCP
58
+ gradio[mcp]>=5.0
59
+
60
+ # utility deps used by modules/
61
+ python-dotenv
62
+ numpy
63
+ Pillow
64
+ requests
65
+ ```
66
+
67
+ > `ffmpeg` must be installed at the OS level (not via pip); handle in dockerfile.
68
+ > `torch`, `torchaudio` are installed separately in dockerfile (CUDA variants).
69
+
70
+ ---
71
+
72
+ ## Step 3 β€” Fix `dockerfile`
73
+
74
+ Current dockerfile:
75
+ - Missing `apt-get install ffmpeg`
76
+ - Missing `pip install -r requirements.txt`
77
+ - Missing demucs, yt-dlp, pydub installs
78
+
79
+ Updated dockerfile structure:
80
+
81
+ ```dockerfile
82
+ FROM python:3.12-slim
83
+
84
+ # System deps: ffmpeg for audio processing + Deno for yt-dlp JS extractor
85
+ RUN apt-get update && apt-get install -y --no-install-recommends \
86
+ ffmpeg curl unzip git \
87
+ && curl -fsSL https://deno.land/install.sh | sh \
88
+ && cp /root/.deno/bin/deno /usr/local/bin/ \
89
+ && rm -rf /var/lib/apt/lists/*
90
+
91
+ WORKDIR /app
92
+ COPY requirements.txt .
93
+
94
+ # Install torch CPU build first (HF Spaces GPU spaces override separately)
95
+ RUN pip install --no-cache-dir torch torchaudio --index-url https://download.pytorch.org/whl/cpu
96
+ RUN pip install --no-cache-dir gradio[mcp] transformers
97
+ RUN pip install --no-cache-dir -r requirements.txt
98
+
99
+ COPY . .
100
+
101
+ EXPOSE 7860
102
+ CMD ["python", "app.py"]
103
+ ```
104
+
105
+ > For HF Spaces GPU, the base image and torch install are handled by the Space
106
+ > runtime β€” the dockerfile may be simplified or replaced by `sdk: gradio` in README.
107
+
108
+ ---
109
+
110
+ ## Step 4 β€” Create `app.py`
111
+
112
+ `app.py` is the missing entry point. It must:
113
+
114
+ 1. Import and wrap `yt_audio_get_tracks.download_audio` and `separate_tracks`
115
+ 2. Build a Gradio `gr.Blocks` interface
116
+ 3. Use the `AudioGallery` custom component (per copilot-instructions.md)
117
+ 4. Show footer via `modules/version_info.versions_html()`
118
+ 5. Launch with `mcp_server=True` for MCP endpoint at `/gradio_api/mcp/sse`
119
+
120
+ ### `app.py` β€” Skeleton
121
+
122
+ ```python
123
+ # app.py
124
+ import os
125
+ import gradio as gr
126
+ from yt_audio_get_tracks import download_audio, separate_tracks
127
+ from modules.version_info import versions_html
128
+
129
+ CSS_TEMPLATE = """...""" # AudioGallery CSS
130
+ JS_ON_LOAD = """...""" # AudioGallery waveform JS
131
+
132
+ class AudioGallery(gr.HTML):
133
+ def __init__(self, audio_urls, *, value=None, labels=None,
134
+ columns=3, label=None, **kwargs):
135
+         # build HTML grid from template (see copilot-instructions.md)
+         ...
+         super().__init__(value=html, label=label, **kwargs)
+
+
+ def process_video(video_id: str):
+     """Download YouTube audio and return separated stems."""
+     url = f"https://www.youtube.com/watch?v={video_id}"
+     wav = download_audio(url, video_id)
+     drums, vocals, guitar, bass, other, piano, music = separate_tracks(wav, video_id)
+     return drums, vocals, guitar, bass, other, piano, music
+
+
+ with gr.Blocks(title="SeparateTracks") as demo:
+     gr.Markdown("## 🎼 SeparateTracks — Stem Separator")
+     with gr.Row():
+         video_id_input = gr.Textbox(label="YouTube Video ID", placeholder="dQw4w9WgXcQ")
+         run_btn = gr.Button("Separate Tracks", variant="primary")
+     with gr.Row():
+         status = gr.Textbox(label="Status", interactive=False)
+     # AudioGallery output rendered after processing
+     audio_output = gr.HTML(label="Separated Tracks")
+     footer = gr.HTML(value=versions_html())
+
+     run_btn.click(fn=process_video, inputs=video_id_input, outputs=audio_output)
+
+ if __name__ == "__main__":
+     demo.launch(mcp_server=True, server_name="0.0.0.0", server_port=7860)
+ ```
+
+ ---
+
+ ## Step 5 — Implement `AudioGallery` Component
+
+ Per copilot-instructions.md, the `AudioGallery` extends `gr.HTML` and renders
+ an audio grid with waveform canvases.
+
+ **Required sub-tasks:**
+ - [ ] Define `CSS_TEMPLATE` with `.audio-gallery-container`, `.audio-gallery-grid`,
+       `.audio-item`, `.waveform-canvas`, `.audio-controls` styles
+ - [ ] Define `JS_ON_LOAD` with Web Audio API waveform rendering and play/pause logic
+ - [ ] Build `html_template` using a Python f-string (use `{{ }}` in `<script>` blocks
+       per py.instructions.md)
+ - [ ] Render the 7 stems: drums, vocals, guitar, bass, other, piano, music (combined)
+ - [ ] Wire `process_video` return values into `AudioGallery` via Gradio file serving
+
+ **Reference:** https://huggingface.co/spaces/fffiloni/audio-gallery
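
The sub-tasks above can be sketched as a plain HTML-building helper. This is a minimal, hedged sketch: `CSS_TEMPLATE`, `build_gallery_html`, and the stem paths are illustrative placeholders, not the project's real component code.

```python
# Placeholder styles; the real CSS_TEMPLATE defines the full grid and control styles.
CSS_TEMPLATE = "<style>.audio-gallery-grid { display: grid; gap: 1rem; }</style>"

def build_gallery_html(stem_paths: dict) -> str:
    """Render one .audio-item per stem; {{ }} escapes literal JS braces in the f-string."""
    items = "".join(
        f'<div class="audio-item"><span>{name}</span>'
        f'<audio controls src="{path}"></audio></div>'
        for name, path in stem_paths.items()
    )
    return f"""{CSS_TEMPLATE}
<div class="audio-gallery-container"><div class="audio-gallery-grid">{items}</div></div>
<script>document.addEventListener("DOMContentLoaded", () => {{ /* draw waveforms here */ }});</script>"""
```

The real component would pass a string like this to `super().__init__(value=html, ...)` in the `gr.HTML` subclass.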
+
+ ---
+
+ ## Step 6 — MCP Server Integration
+
+ Gradio 5+ exposes an MCP endpoint automatically at `/gradio_api/mcp/sse` when
+ `demo.launch(mcp_server=True)` is used.
+
+ Per copilot-instructions.md:
+ - Reference: https://huggingface.co/docs/hub/en/agents-mcp
+ - The `process_video` function becomes an MCP tool automatically
+ - Ensure the function has a clear docstring (used as the MCP tool description)
+
+ No additional code is needed beyond `mcp_server=True` in `launch()`.
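
Concretely, the docstring is what an MCP client sees as the tool description, so it is worth keeping short and accurate. A tiny sketch of that convention (not MCP-specific code):

```python
def process_video(video_id: str):
    """Download YouTube audio and return separated stems."""
    # Gradio registers this function as an MCP tool and uses __doc__
    # as the tool description; the first line should describe the action.
    ...
```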
+
+ ---
+
+ ## Step 7 — Fix `modules/constants.py` for Local Dev
+
+ `constants.py` raises `ValueError` if `HF_TOKEN` is missing. This blocks local
+ development without a `.env` file.
+
+ **Options (pick one):**
+ - A) Wrap the raise in a try/except and warn instead of crashing (preferred for local)
+ - B) Set `HF_TOKEN` in `.env` (already done — just ensure `.env` is present)
+
+ Since `.env` exists with `HF_TOKEN`, Option B is sufficient. Ensure `.env` is
+ loaded before `constants.py` is imported.
+
+ **Note:** `constants.py` also imports `numpy` and `python-dotenv` — both must be
+ in `requirements.txt` (covered in Step 2).
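
If Option A were chosen instead, the guard might look like this. This is a hedged sketch: `load_hf_token` is an illustrative helper name, not the existing `constants.py` code.

```python
import os
import warnings

def load_hf_token():
    """Return HF_TOKEN from the environment, warning instead of raising ValueError."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        # Warn so local dev can continue; API calls that need the token will still fail.
        warnings.warn("HF_TOKEN is not set; HuggingFace API calls will fail locally.")
    return token
```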
+
+ ---
+
+ ## Step 8 — Local Run Verification
+
+ ```bash
+ # Prerequisites
+ # - Python 3.12
+ # - ffmpeg in PATH
+ # - .env file with HF_TOKEN set
+
+ pip install -r requirements.txt
+ python app.py
+ # → Open http://localhost:7860
+ # → Enter a YouTube video ID, click "Separate Tracks"
+ # → Verify 7 stems appear in the AudioGallery
+ # → Verify the MCP endpoint at http://localhost:7860/gradio_api/mcp/sse
+ ```
+
+ ---
+
+ ## Step 9 — Docker Verification
+
+ ```bash
+ docker build -t separatetracks .
+ docker run -p 7860:7860 --env-file .env separatetracks
+ # → Open http://localhost:7860 and verify the same checks as Step 8
+ ```
+
+ ---
+
+ ## Step 10 — HuggingFace Space Deployment
+
+ 1. `README.md` already has the correct HF Space header (`sdk: docker`, `app_file: app.py`)
+ 2. Push to the `Surn/SeparateTracks` HF Space repo
+ 3. Set Space secrets: `HF_TOKEN`, `CRYPTO_PK`, `HF_REPO_ID`, `SPACE_NAME`
+ 4. The Space auto-builds from the dockerfile on push
+
+ ---
+
+ ## Dependency Map
+
+ ```
+ app.py
+ ├── yt_audio_get_tracks.py
+ │   ├── yt-dlp (pip)
+ │   ├── pydub (pip) → ffmpeg (apt)
+ │   └── demucs (pip) → torch (pip)
+ ├── modules/constants.py
+ │   ├── python-dotenv (pip)
+ │   └── numpy (pip)
+ ├── modules/version_info.py
+ │   └── gradio (pip)
+ └── modules/file_utils.py
+     ├── Pillow (pip)
+     └── requests (pip)
+ ```
+
+ ---
+
+ ## File Checklist
+
+ | # | File | Action | Done |
+ |---|------|--------|------|
+ | 1 | `.gitignore` | Add `.env` entry | [x] |
+ | 2 | `requirements.txt` | Add gradio, dotenv, numpy, Pillow, requests | [x] |
+ | 3 | `dockerfile` | Add ffmpeg apt, fix pip installs | [x] |
+ | 4 | `app.py` | Create Gradio app with AudioGallery + MCP | [x] |
+ | 5 | `modules/constants.py` | Verify local-safe (no crash without HF_TOKEN) | [x] `.env` present — no code change needed |
+
+ ---
+
+ ## Notes
+
+ - **Deno**: Required by yt-dlp for some YouTube JS extraction. The dockerfile installs it
+   from `deno.land/install.sh`. Locally, download it from
+   https://github.com/denoland/deno/releases/latest/download/deno-x86_64-pc-windows-msvc.zip
+   and add `deno.exe` to PATH or the project root.
+ - **Demucs model**: `htdemucs_6s` downloads ~1.5 GB on first run. In Docker, this
+   happens at runtime unless it is pre-cached in the image.
+ - **Python style**: Black + ruff + isort per agent conventions. PEP 8, 4-space indent,
+   79-char lines.
+ - **AudioGallery JS**: Use `{{ }}` for JS template literals inside Python f-strings
+   (py.instructions.md rule).
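
The `{{ }}` rule in practice, as a tiny self-contained example:

```python
stem = "drums"
snippet = f"""<script>
  const label = `{stem}`;             // single braces: Python interpolation
  const onPlay = () => {{ play(); }};   // double braces: literal JS braces
</script>"""
```

Double braces survive formatting as single literal braces, so the emitted `<script>` block is valid JavaScript.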
yt_audio_get_tracks.py ADDED
@@ -0,0 +1,68 @@
+ # yt_audio_get_tracks.py
+ # pip install yt-dlp demucs pydub (ffmpeg required)
+ import os
+ import shutil
+ import subprocess
+
+ import yt_dlp
+ from pydub import AudioSegment
+
+
+ def download_audio(url, video_id):
+     """Download the best audio stream and extract it to WAV."""
+     temp_dir = 'separated'
+     os.makedirs(temp_dir, exist_ok=True)
+     ydl_opts = {
+         'format': 'bestaudio/best',
+         'outtmpl': os.path.join(temp_dir, f'{video_id}.%(ext)s'),
+         'postprocessors': [{'key': 'FFmpegExtractAudio', 'preferredcodec': 'wav'}],
+         'keepvideo': True,
+         'quiet': False,
+         'no_warnings': False,
+         'user_agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
+         'http_headers': {'Referer': 'https://www.youtube.com/'},
+         # 'cookiesfrombrowser': ('chrome', None, None),
+     }
+
+     if shutil.which('deno') is None:
+         print("⚠️ Deno not found.")
+         ydl_opts['compat_opts'] = ['no-youtube-js']
+
+     with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+         ydl.download([url])
+     return os.path.join(temp_dir, f'{video_id}.wav')
+
+
+ def separate_tracks(input_wav, video_id):
+     """Run Demucs 6-stem separation and build a combined bass+other music track."""
+     if not os.path.exists(input_wav):
+         raise FileNotFoundError(f"{input_wav} does not exist")
+
+     output_dir = 'separated'
+     subprocess.run(['demucs', '-n', 'htdemucs_6s', '--mp3', '--out', output_dir, input_wav], check=True)
+
+     base = os.path.join(output_dir, 'htdemucs_6s', video_id)
+
+     drums = f'{base}/drums.mp3'
+     vocals = f'{base}/vocals.mp3'
+     bass = f'{base}/bass.mp3'
+     guitar = f'{base}/guitar.mp3'
+     piano = f'{base}/piano.mp3'
+     other = f'{base}/other.mp3'
+
+     # "music" = instrumental bed mixed from the bass and other stems
+     music = AudioSegment.from_mp3(bass).overlay(AudioSegment.from_mp3(other))
+     music_path = os.path.join(base, 'music.mp3')
+     music.export(music_path, format="mp3")
+
+     os.remove(input_wav)
+
+     return drums, vocals, guitar, bass, other, piano, music_path
+
+
+ def main():
+     video_id = input("Enter YouTube video ID: ")
+     url = f"https://www.youtube.com/watch?v={video_id}"
+     try:
+         wav = download_audio(url, video_id)
+         drums, vocals, guitar, bass, other, piano, music = separate_tracks(wav, video_id)
+         print(drums, vocals, guitar, bass, other, piano, music)
+     except Exception as exc:
+         print(exc)
+
+
+ if __name__ == "__main__":
+     main()