Space: DataQuests/DeepCritical

VibecoderMcSwaggins committed 11 days ago

Commit 2dc022a (1 parent: 2763672)

feat(architecture): update system design and roadmap for Phases 1-8

- Revised high-level design to include detailed phases and components of the Magentic system.
- Introduced new agents: Magentic Manager, SearchAgent, HypothesisAgent, JudgeAgent, and ReportAgent, each with specific roles in the research process.
- Updated success criteria to reflect completion of Phases 1-5 and outlined specifications for Phases 6-8.
- Enhanced documentation for clarity on system architecture and agent interactions.

Files changed (2)

docs/architecture/overview.md +67 -68
docs/implementation/roadmap.md +82 -12

docs/architecture/overview.md CHANGED

@@ -63,53 +63,58 @@ Using existing approved drugs to treat NEW diseases they weren't originally desi
 ## System Architecture
-### High-Level Design
 ```
-User Question
     ↓
-Research Agent (Orchestrator)
     ↓
-Search Loop:
-  1. Query Tools (PubMed, Web, Clinical Trials)
-  2. Gather Evidence
-  3. Judge Quality ("Do we have enough?")
-  4. If NO → Refine query, search more
-  5. If YES → Synthesize findings
     ↓
-Research Report with Citations
 ```
 ### Key Components
-1. **Research Agent (Orchestrator)**
-   - Manages the research process
-   - Plans search strategies
-   - Coordinates tools
-   - Tracks token budget and iterations
-2. **Tools**
-   - PubMed Search (biomedical papers)
-   - Web Search (general medical info)
-   - Clinical Trials Database
-   - Drug Information APIs
-   - (Future: Protein databases, pathways)
-3. **Judge System**
-   - LLM-based quality assessment
-   - Evaluates: "Do we have enough evidence?"
-   - Criteria: Coverage, reliability, citation quality
-4. **Break Conditions**
-   - Token budget cap (cost control)
-   - Max iterations (time control)
-   - Judge says "sufficient evidence" (quality control)
-5. **Gradio UI**
-   - Simple text input for questions
-   - Real-time progress display
-   - Formatted research report output
-   - Source citations and links
 ---
@@ -275,37 +280,31 @@ httpx = "^0.27"
 ## Success Criteria
-### Minimum Viable Product (MVP) - Days 1-3
-**MUST HAVE for working demo:**
 - [x] User can ask drug repurposing question
-- [ ] Agent searches PubMed (async)
-- [ ] Agent searches web (Brave/DuckDuckGo)
-- [ ] LLM judge evaluates evidence quality
-- [ ] System respects token budget (50K tokens max)
-- [ ] Output includes drug candidates + citations
-- [ ] Works end-to-end for demo query: "Long COVID fatigue"
-- [ ] Gradio UI with streaming progress
-### Hackathon Submission - Days 4-5
-**Required for all tracks:**
-- [ ] Gradio UI deployed on HuggingFace Spaces
-- [ ] 3 example queries working and tested
-- [ ] This architecture documentation
-- [ ] Demo video (2-3 min) showing workflow
-- [ ] README with setup instructions
-**Track-Specific:**
-- [ ] **Gradio Track**: Streaming UI, progress indicators, modern design
-- [ ] **MCP Track**: PubMed tool as MCP server (reusable by others)
-- [ ] **Modal Track**: GPU inference option (stretch)
-### Stretch Goals - Day 6+
-**Nice-to-have if time permits:**
-- [ ] Modal integration for local LLM fallback
-- [ ] Clinical trials database search
-- [ ] Checkpoint/resume functionality
-- [ ] OpenFDA drug safety lookup
-- [ ] PDF export of research reports
 ### What's EXPLICITLY Out of Scope
 **NOT building (to stay focused):**

 ## System Architecture
+### High-Level Design (Phases 1-8)
 ```
+User Query
     ↓
+Gradio UI (Phase 4)
     ↓
+Magentic Manager (Phase 5) ← LLM-powered coordinator
+    ├── SearchAgent (Phase 2+5) ←→ PubMed + Web + VectorDB (Phase 6)
+    ├── HypothesisAgent (Phase 7) ←→ Mechanistic Reasoning
+    ├── JudgeAgent (Phase 3+5) ←→ Evidence Assessment
+    └── ReportAgent (Phase 8) ←→ Final Synthesis
     ↓
+Structured Research Report
 ```
 ### Key Components
+1. **Magentic Manager (Orchestrator)**
+   - LLM-powered multi-agent coordinator
+   - Dynamic planning and agent selection
+   - Built-in stall detection and replanning
+   - Microsoft Agent Framework integration
+2. **SearchAgent (Phase 2+5+6)**
+   - PubMed E-utilities search
+   - DuckDuckGo web search
+   - Semantic search via ChromaDB (Phase 6)
+   - Evidence deduplication
+3. **HypothesisAgent (Phase 7)**
+   - Generates Drug → Target → Pathway → Effect hypotheses
+   - Guides targeted searches
+   - Scientific reasoning about mechanisms
+4. **JudgeAgent (Phase 3+5)**
+   - LLM-based evidence assessment
+   - Mechanism score + Clinical score
+   - Recommends continue/synthesize
+   - Generates refined search queries
+5. **ReportAgent (Phase 8)**
+   - Structured scientific reports
+   - Executive summary, methodology
+   - Hypotheses tested with evidence counts
+   - Proper citations and limitations
+6. **Gradio UI (Phase 4)**
+   - Chat interface for questions
+   - Real-time progress via events
+   - Mode toggle (Simple/Magentic)
+   - Formatted markdown output
 ---
 ## Success Criteria
+### Phase 1-5 (MVP) ✅ COMPLETE
+**Completed in ONE DAY:**
 - [x] User can ask drug repurposing question
+- [x] Agent searches PubMed (async)
+- [x] Agent searches web (DuckDuckGo)
+- [x] LLM judge evaluates evidence quality
+- [x] System respects token budget and iterations
+- [x] Output includes drug candidates + citations
+- [x] Works end-to-end for demo query
+- [x] Gradio UI with streaming progress
+- [x] Magentic multi-agent orchestration
+- [x] 38 unit tests passing
+- [x] CI/CD pipeline green
+### Hackathon Submission ✅ COMPLETE
+- [x] Gradio UI deployed on HuggingFace Spaces
+- [x] Example queries working and tested
+- [x] Architecture documentation
+- [x] README with setup instructions
+### Phase 6-8 (Enhanced)
+**Specs ready for implementation:**
+- [ ] Embeddings & Semantic Search (Phase 6)
+- [ ] Hypothesis Agent (Phase 7)
+- [ ] Report Agent (Phase 8)
 ### What's EXPLICITLY Out of Scope
 **NOT building (to stay focused):**

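Review note on the overview.md changes: the diff describes a flow in which SearchAgent gathers evidence, JudgeAgent recommends continue or synthesize and proposes refined queries, and the run stops on a token budget, an iteration cap, or a sufficiency verdict. A minimal Python sketch of that control flow is given here. It is not the project's code: `Assessment`, `research_loop`, the stub callables, and the flat `tokens_per_item` cost are all hypothetical names chosen for illustration, and in the documented design the Magentic Manager handles coordination rather than a hand-written loop.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Assessment:
    """Hypothetical judge verdict: two scores plus a continue/synthesize decision."""
    mechanism_score: int
    clinical_score: int
    sufficient: bool
    next_queries: list[str] = field(default_factory=list)


def research_loop(
    question: str,
    search: Callable[[str], list[str]],
    judge: Callable[[str, list[str]], Assessment],
    max_iterations: int = 5,
    token_budget: int = 50_000,
    tokens_per_item: int = 500,  # assumption: crude stand-in for real token accounting
) -> tuple[list[str], str]:
    """Run search -> judge until one break condition fires.

    Returns the deduplicated evidence and the reason the loop stopped.
    """
    evidence: list[str] = []
    queries = [question]
    tokens_used = 0

    for _ in range(max_iterations):
        for query in queries:
            for item in search(query):
                if item not in evidence:  # exact-match deduplication only
                    evidence.append(item)
                    tokens_used += tokens_per_item
        if tokens_used >= token_budget:  # cost control
            return evidence, "token_budget"
        assessment = judge(question, evidence)
        if assessment.sufficient:  # quality control
            return evidence, "sufficient_evidence"
        # Fall back to the original question if the judge offers no refinement.
        queries = assessment.next_queries or [question]
    return evidence, "max_iterations"  # time control


# Stub tools so the sketch runs without network or LLM access.
def fake_search(query: str) -> list[str]:
    return [f"paper about {query}"]


def fake_judge(question: str, evidence: list[str]) -> Assessment:
    return Assessment(
        mechanism_score=5,
        clinical_score=5,
        sufficient=len(evidence) >= 2,
        next_queries=[f"{question} mechanism"],
    )


evidence, reason = research_loop("metformin long COVID", fake_search, fake_judge)
print(reason, len(evidence))
```

Returning the stop reason alongside the evidence makes each of the three break conditions easy to exercise in a unit test with stubbed tools.
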
docs/implementation/roadmap.md CHANGED

@@ -115,26 +115,96 @@ tests/
 ---
-### **Phase 5: Magentic Integration (OPTIONAL - Post-MVP)**
 *Goal: Upgrade orchestrator to use Microsoft Agent Framework patterns.*
-- [ ] Wrap SearchHandler as `AgentProtocol` (SearchAgent) with strict protocol compliance.
-- [ ] Wrap JudgeHandler as `AgentProtocol` (JudgeAgent) with strict protocol compliance.
-- [ ] Implement `MagenticOrchestrator` using `MagenticBuilder`.
-- [ ] Create factory pattern for switching implementations.
 - **Deliverable**: Same API, better multi-agent orchestration engine.
-**NOTE**: Only implement Phase 5 if time permits after MVP is shipped.
 ---
 ## Spec Documents
-1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)**
-2. **[Phase 2 Spec: Search Slice](02_phase_search.md)**
-3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)**
-4. **[Phase 4 Spec: UI & Loop](04_phase_ui.md)**
-5. **[Phase 5 Spec: Magentic Integration](05_phase_magentic.md)** *(Optional)*
-*Start by reading Phase 1 Spec to initialize the repo.*

 ---
+### **Phase 5: Magentic Integration** ✅ COMPLETE
 *Goal: Upgrade orchestrator to use Microsoft Agent Framework patterns.*
+- [x] Wrap SearchHandler as `AgentProtocol` (SearchAgent) with strict protocol compliance.
+- [x] Wrap JudgeHandler as `AgentProtocol` (JudgeAgent) with strict protocol compliance.
+- [x] Implement `MagenticOrchestrator` using `MagenticBuilder`.
+- [x] Create factory pattern for switching implementations.
 - **Deliverable**: Same API, better multi-agent orchestration engine.
+---
+### **Phase 6: Embeddings & Semantic Search**
+*Goal: Add vector search for semantic evidence retrieval.*
+- [ ] Implement `EmbeddingService` with ChromaDB.
+- [ ] Add semantic deduplication to SearchAgent.
+- [ ] Enable semantic search for related evidence.
+- [ ] Store embeddings in shared context.
+- **Deliverable**: Find semantically related papers, not just keyword matches.
+---
+### **Phase 7: Hypothesis Agent**
+*Goal: Generate scientific hypotheses to guide targeted searches.*
+- [ ] Implement `MechanismHypothesis` and `HypothesisAssessment` models.
+- [ ] Implement `HypothesisAgent` for mechanistic reasoning.
+- [ ] Add hypothesis-driven search queries.
+- [ ] Integrate into Magentic workflow.
+- **Deliverable**: Drug → Target → Pathway → Effect hypotheses that guide research.
+---
+### **Phase 8: Report Agent**
+*Goal: Generate structured scientific reports with proper citations.*
+- [ ] Implement `ResearchReport` model with all sections.
+- [ ] Implement `ReportAgent` for synthesis.
+- [ ] Include methodology, limitations, formatted references.
+- [ ] Integrate as final synthesis step in Magentic workflow.
+- **Deliverable**: Publication-quality research reports.
+---
+## Complete Architecture (Phases 1-8)
+```
+User Query
+    ↓
+Gradio UI (Phase 4)
+    ↓
+Magentic Manager (Phase 5)
+    ├── SearchAgent (Phase 2+5) ←→ PubMed + Web + VectorDB (Phase 6)
+    ├── HypothesisAgent (Phase 7) ←→ Mechanistic Reasoning
+    ├── JudgeAgent (Phase 3+5) ←→ Evidence Assessment
+    └── ReportAgent (Phase 8) ←→ Final Synthesis
+    ↓
+Structured Research Report
+```
 ---
 ## Spec Documents
+1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)** ✅
+2. **[Phase 2 Spec: Search Slice](02_phase_search.md)** ✅
+3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)** ✅
+4. **[Phase 4 Spec: UI & Loop](04_phase_ui.md)** ✅
+5. **[Phase 5 Spec: Magentic Integration](05_phase_magentic.md)** ✅
+6. **[Phase 6 Spec: Embeddings & Semantic Search](06_phase_embeddings.md)**
+7. **[Phase 7 Spec: Hypothesis Agent](07_phase_hypothesis.md)**
+8. **[Phase 8 Spec: Report Agent](08_phase_report.md)**
+---
+## Progress Summary
+| Phase | Status | Deliverable |
+|-------|--------|-------------|
+| Phase 1: Foundation | ✅ COMPLETE | CI-ready repo with uv/pytest |
+| Phase 2: Search | ✅ COMPLETE | PubMed + Web search |
+| Phase 3: Judge | ✅ COMPLETE | LLM evidence assessment |
+| Phase 4: UI & Loop | ✅ COMPLETE | Working Gradio app |
+| Phase 5: Magentic | ✅ COMPLETE | Multi-agent orchestration |
+| Phase 6: Embeddings | 📝 SPEC READY | Semantic search |
+| Phase 7: Hypothesis | 📝 SPEC READY | Mechanistic reasoning |
+| Phase 8: Report | 📝 SPEC READY | Structured reports |
+*Phases 1-5 completed in ONE DAY. Phases 6-8 specs ready for implementation.*
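
Review note on Phase 7: the roadmap names a `MechanismHypothesis` model and hypothesis-driven search queries but does not show their shape. One possible reading is sketched here with a standard-library dataclass. The field names, the `chain` and `search_queries` helpers, and the example values are assumptions for illustration only; the actual spec in `07_phase_hypothesis.md` may define the model differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MechanismHypothesis:
    """One Drug -> Target -> Pathway -> Effect chain (field names are assumptions)."""
    drug: str
    target: str
    pathway: str
    effect: str

    def chain(self) -> str:
        """Render the hypothesis as a single readable chain."""
        return " -> ".join((self.drug, self.target, self.pathway, self.effect))

    def search_queries(self) -> list[str]:
        """Derive one query per link so each step can be checked against evidence."""
        return [
            f"{self.drug} {self.target}",
            f"{self.target} {self.pathway}",
            f"{self.pathway} {self.effect}",
        ]


# Illustrative values only, not a finding produced by the project.
hypothesis = MechanismHypothesis(
    drug="metformin",
    target="AMPK",
    pathway="mTOR signaling",
    effect="reduced inflammation",
)
print(hypothesis.chain())
for query in hypothesis.search_queries():
    print(query)
```

Splitting the chain into one query per link is a design choice in this sketch: it lets a judge step count supporting evidence for each link separately, which fits the roadmap's "hypotheses tested with evidence counts" wording.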