VibecoderMcSwaggins committed on
Commit
2dc022a
·
1 Parent(s): 2763672

feat(architecture): update system design and roadmap for Phases 1-8


- Revised high-level design to include detailed phases and components of the Magentic system.
- Introduced new agents: Magentic Manager, SearchAgent, HypothesisAgent, JudgeAgent, and ReportAgent, each with specific roles in the research process.
- Updated success criteria to reflect completion of Phases 1-5 and outlined specifications for Phases 6-8.
- Enhanced documentation for clarity on system architecture and agent interactions.

docs/architecture/overview.md CHANGED
@@ -63,53 +63,58 @@ Using existing approved drugs to treat NEW diseases they weren't originally desi
63
 
64
  ## System Architecture
65
 
66
- ### High-Level Design
67
 
68
  ```
69
- User Question
70
  ↓
71
- Research Agent (Orchestrator)
72
  ↓
73
- Search Loop:
74
- 1. Query Tools (PubMed, Web, Clinical Trials)
75
- 2. Gather Evidence
76
- 3. Judge Quality ("Do we have enough?")
77
- 4. If NO → Refine query, search more
78
- 5. If YES → Synthesize findings
79
  ↓
80
- Research Report with Citations
81
  ```
82
 
83
  ### Key Components
84
 
85
- 1. **Research Agent (Orchestrator)**
86
- - Manages the research process
87
- - Plans search strategies
88
- - Coordinates tools
89
- - Tracks token budget and iterations
90
-
91
- 2. **Tools**
92
- - PubMed Search (biomedical papers)
93
- - Web Search (general medical info)
94
- - Clinical Trials Database
95
- - Drug Information APIs
96
- - (Future: Protein databases, pathways)
97
-
98
- 3. **Judge System**
99
- - LLM-based quality assessment
100
- - Evaluates: "Do we have enough evidence?"
101
- - Criteria: Coverage, reliability, citation quality
102
-
103
- 4. **Break Conditions**
104
- - Token budget cap (cost control)
105
- - Max iterations (time control)
106
- - Judge says "sufficient evidence" (quality control)
107
-
108
- 5. **Gradio UI**
109
- - Simple text input for questions
110
- - Real-time progress display
111
- - Formatted research report output
112
- - Source citations and links
113
 
114
  ---
115
 
@@ -275,37 +280,31 @@ httpx = "^0.27"
275
 
276
  ## Success Criteria
277
 
278
- ### Minimum Viable Product (MVP) - Days 1-3
279
- **MUST HAVE for working demo:**
280
  - [x] User can ask drug repurposing question
281
- - [ ] Agent searches PubMed (async)
282
- - [ ] Agent searches web (Brave/DuckDuckGo)
283
- - [ ] LLM judge evaluates evidence quality
284
- - [ ] System respects token budget (50K tokens max)
285
- - [ ] Output includes drug candidates + citations
286
- - [ ] Works end-to-end for demo query: "Long COVID fatigue"
287
- - [ ] Gradio UI with streaming progress
288
-
289
- ### Hackathon Submission - Days 4-5
290
- **Required for all tracks:**
291
- - [ ] Gradio UI deployed on HuggingFace Spaces
292
- - [ ] 3 example queries working and tested
293
- - [ ] This architecture documentation
294
- - [ ] Demo video (2-3 min) showing workflow
295
- - [ ] README with setup instructions
296
-
297
- **Track-Specific:**
298
- - [ ] **Gradio Track**: Streaming UI, progress indicators, modern design
299
- - [ ] **MCP Track**: PubMed tool as MCP server (reusable by others)
300
- - [ ] **Modal Track**: GPU inference option (stretch)
301
-
302
- ### Stretch Goals - Day 6+
303
- **Nice-to-have if time permits:**
304
- - [ ] Modal integration for local LLM fallback
305
- - [ ] Clinical trials database search
306
- - [ ] Checkpoint/resume functionality
307
- - [ ] OpenFDA drug safety lookup
308
- - [ ] PDF export of research reports
309
 
310
  ### What's EXPLICITLY Out of Scope
311
  **NOT building (to stay focused):**
 
63
 
64
  ## System Architecture
65
 
66
+ ### High-Level Design (Phases 1-8)
67
 
68
  ```
69
+ User Query
70
  ↓
71
+ Gradio UI (Phase 4)
72
  ↓
73
+ Magentic Manager (Phase 5) ← LLM-powered coordinator
74
+ ├── SearchAgent (Phase 2+5) ←→ PubMed + Web + VectorDB (Phase 6)
75
+ ├── HypothesisAgent (Phase 7) ←→ Mechanistic Reasoning
76
+ ├── JudgeAgent (Phase 3+5) ←→ Evidence Assessment
77
+ └── ReportAgent (Phase 8) ←→ Final Synthesis
 
78
  ↓
79
+ Structured Research Report
80
  ```
81
 
82
  ### Key Components
83
 
84
+ 1. **Magentic Manager (Orchestrator)**
85
+ - LLM-powered multi-agent coordinator
86
+ - Dynamic planning and agent selection
87
+ - Built-in stall detection and replanning
88
+ - Microsoft Agent Framework integration
89
+
90
+ 2. **SearchAgent (Phase 2+5+6)**
91
+ - PubMed E-utilities search
92
+ - DuckDuckGo web search
93
+ - Semantic search via ChromaDB (Phase 6)
94
+ - Evidence deduplication
95
+
96
+ 3. **HypothesisAgent (Phase 7)**
97
+ - Generates Drug → Target → Pathway → Effect hypotheses
98
+ - Guides targeted searches
99
+ - Scientific reasoning about mechanisms
100
+
101
+ 4. **JudgeAgent (Phase 3+5)**
102
+ - LLM-based evidence assessment
103
+ - Mechanism score + Clinical score
104
+ - Recommends continue/synthesize
105
+ - Generates refined search queries
106
+
107
+ 5. **ReportAgent (Phase 8)**
108
+ - Structured scientific reports
109
+ - Executive summary, methodology
110
+ - Hypotheses tested with evidence counts
111
+ - Proper citations and limitations
112
+
113
+ 6. **Gradio UI (Phase 4)**
114
+ - Chat interface for questions
115
+ - Real-time progress via events
116
+ - Mode toggle (Simple/Magentic)
117
+ - Formatted markdown output
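
The JudgeAgent item above lists a mechanism score, a clinical score, a continue/synthesize recommendation, and refined queries. A minimal sketch of how that result could be modeled follows. The class name `JudgeAssessment`, the 0-10 scale, and the threshold of 6 are assumptions for illustration, not details taken from the project.

```python
from dataclasses import dataclass, field
from enum import Enum


class Recommendation(Enum):
    CONTINUE = "continue"
    SYNTHESIZE = "synthesize"


@dataclass
class JudgeAssessment:
    """Hypothetical result shape for one JudgeAgent pass."""

    mechanism_score: int  # assumed 0-10 scale
    clinical_score: int   # assumed 0-10 scale
    next_queries: list[str] = field(default_factory=list)

    def recommendation(self, threshold: int = 6) -> Recommendation:
        # Assumed rule: synthesize only when both scores clear the threshold.
        if self.mechanism_score >= threshold and self.clinical_score >= threshold:
            return Recommendation.SYNTHESIZE
        return Recommendation.CONTINUE
```

With this shape, an assessment that returns `CONTINUE` would carry its refined queries in `next_queries` for the next search round.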
118
 
119
  ---
120
 
 
280
 
281
  ## Success Criteria
282
 
283
+ ### Phase 1-5 (MVP) ✅ COMPLETE
284
+ **Completed in ONE DAY:**
285
  - [x] User can ask drug repurposing question
286
+ - [x] Agent searches PubMed (async)
287
+ - [x] Agent searches web (DuckDuckGo)
288
+ - [x] LLM judge evaluates evidence quality
289
+ - [x] System respects token budget and iterations
290
+ - [x] Output includes drug candidates + citations
291
+ - [x] Works end-to-end for demo query
292
+ - [x] Gradio UI with streaming progress
293
+ - [x] Magentic multi-agent orchestration
294
+ - [x] 38 unit tests passing
295
+ - [x] CI/CD pipeline green
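
The checklist above includes a search loop that respects a token budget and an iteration limit, and that stops once the judge finds the evidence sufficient. A rough sketch of those three break conditions is shown here. The function name `run_loop`, the callback signatures, and the default of 10 iterations are assumptions; the 50K token figure comes from the earlier version of this checklist.

```python
from typing import Callable


def run_loop(
    query: str,
    search: Callable[[str], tuple[list[str], int]],
    judge: Callable[[list[str]], tuple[bool, str]],
    max_iterations: int = 10,
    token_budget: int = 50_000,
) -> tuple[list[str], str]:
    """Return the collected evidence and the reason the loop stopped."""
    evidence: list[str] = []
    tokens_used = 0
    for _ in range(max_iterations):
        found, cost = search(query)  # hypothetical: (new evidence, tokens spent)
        evidence.extend(found)
        tokens_used += cost
        if tokens_used >= token_budget:
            return evidence, "token_budget"
        sufficient, query = judge(evidence)  # hypothetical: (enough?, refined query)
        if sufficient:
            return evidence, "sufficient_evidence"
    return evidence, "max_iterations"
```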
296
+
297
+ ### Hackathon Submission ✅ COMPLETE
298
+ - [x] Gradio UI deployed on HuggingFace Spaces
299
+ - [x] Example queries working and tested
300
+ - [x] Architecture documentation
301
+ - [x] README with setup instructions
302
+
303
+ ### Phase 6-8 (Enhanced)
304
+ **Specs ready for implementation:**
305
+ - [ ] Embeddings & Semantic Search (Phase 6)
306
+ - [ ] Hypothesis Agent (Phase 7)
307
+ - [ ] Report Agent (Phase 8)
308
 
309
  ### What's EXPLICITLY Out of Scope
310
  **NOT building (to stay focused):**
docs/implementation/roadmap.md CHANGED
@@ -115,26 +115,96 @@ tests/
115
 
116
  ---
117
 
118
- ### **Phase 5: Magentic Integration (OPTIONAL - Post-MVP)**
119
 
120
  *Goal: Upgrade orchestrator to use Microsoft Agent Framework patterns.*
121
 
122
- - [ ] Wrap SearchHandler as `AgentProtocol` (SearchAgent) with strict protocol compliance.
123
- - [ ] Wrap JudgeHandler as `AgentProtocol` (JudgeAgent) with strict protocol compliance.
124
- - [ ] Implement `MagenticOrchestrator` using `MagenticBuilder`.
125
- - [ ] Create factory pattern for switching implementations.
126
  - **Deliverable**: Same API, better multi-agent orchestration engine.
127
 
128
- **NOTE**: Only implement Phase 5 if time permits after MVP is shipped.
129
 
130
  ---
131
 
132
  ## Spec Documents
133
 
134
- 1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)**
135
- 2. **[Phase 2 Spec: Search Slice](02_phase_search.md)**
136
- 3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)**
137
- 4. **[Phase 4 Spec: UI & Loop](04_phase_ui.md)**
138
- 5. **[Phase 5 Spec: Magentic Integration](05_phase_magentic.md)** *(Optional)*
139
 
140
- *Start by reading Phase 1 Spec to initialize the repo.*
 
115
 
116
  ---
117
 
118
+ ### **Phase 5: Magentic Integration** ✅ COMPLETE
119
 
120
  *Goal: Upgrade orchestrator to use Microsoft Agent Framework patterns.*
121
 
122
+ - [x] Wrap SearchHandler as `AgentProtocol` (SearchAgent) with strict protocol compliance.
123
+ - [x] Wrap JudgeHandler as `AgentProtocol` (JudgeAgent) with strict protocol compliance.
124
+ - [x] Implement `MagenticOrchestrator` using `MagenticBuilder`.
125
+ - [x] Create factory pattern for switching implementations.
126
  - **Deliverable**: Same API, better multi-agent orchestration engine.
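
The "factory pattern for switching implementations" item can be pictured roughly as below. This is only a sketch: `SimpleOrchestrator`, `create_orchestrator`, and the `run` method are hypothetical names, and the `MagenticOrchestrator` here is a stub standing in for the real one built with `MagenticBuilder`.

```python
from typing import Protocol


class Orchestrator(Protocol):
    """Shared interface, so callers do not care which engine runs."""

    def run(self, query: str) -> str: ...


class SimpleOrchestrator:
    def run(self, query: str) -> str:
        return f"[simple] {query}"


class MagenticOrchestrator:
    # Stub only; the real class would wrap the multi-agent workflow.
    def run(self, query: str) -> str:
        return f"[magentic] {query}"


def create_orchestrator(mode: str = "simple") -> Orchestrator:
    registry = {"simple": SimpleOrchestrator, "magentic": MagenticOrchestrator}
    try:
        return registry[mode]()
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```

Keeping both engines behind one interface is what makes the "same API" deliverable possible: a UI mode toggle only has to change the string passed to the factory.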
127
 
128
+ ---
129
+
130
+ ### **Phase 6: Embeddings & Semantic Search**
131
+
132
+ *Goal: Add vector search for semantic evidence retrieval.*
133
+
134
+ - [ ] Implement `EmbeddingService` with ChromaDB.
135
+ - [ ] Add semantic deduplication to SearchAgent.
136
+ - [ ] Enable semantic search for related evidence.
137
+ - [ ] Store embeddings in shared context.
138
+ - **Deliverable**: Find semantically related papers, not just keyword matches.
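
Semantic deduplication can be illustrated without any vector database. The sketch below compares embeddings by cosine similarity and keeps the first item of each near-duplicate group. The function names, the 0.9 threshold, and the toy two-dimensional vectors are assumptions; the planned `EmbeddingService` would supply real embeddings through ChromaDB.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(items: list[tuple[str, list[float]]], threshold: float = 0.9) -> list[str]:
    """Keep a title only if its embedding is not too close to one already kept."""
    kept: list[tuple[str, list[float]]] = []
    for title, vec in items:
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((title, vec))
    return [title for title, _ in kept]


papers = [
    ("Paper A", [1.0, 0.0]),
    ("Paper A (preprint)", [0.99, 0.05]),  # nearly identical direction
    ("Paper B", [0.0, 1.0]),
]
print(deduplicate(papers))
```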
139
+
140
+ ---
141
+
142
+ ### **Phase 7: Hypothesis Agent**
143
+
144
+ *Goal: Generate scientific hypotheses to guide targeted searches.*
145
+
146
+ - [ ] Implement `MechanismHypothesis` and `HypothesisAssessment` models.
147
+ - [ ] Implement `HypothesisAgent` for mechanistic reasoning.
148
+ - [ ] Add hypothesis-driven search queries.
149
+ - [ ] Integrate into Magentic workflow.
150
+ - **Deliverable**: Drug → Target → Pathway → Effect hypotheses that guide research.
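
One possible shape for the `MechanismHypothesis` model named above is sketched here. The four fields mirror the Drug, Target, Pathway, Effect chain; the field names and both methods are assumptions rather than the spec's definition, and the example values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MechanismHypothesis:
    drug: str
    target: str
    pathway: str
    effect: str

    def chain(self) -> str:
        return " → ".join([self.drug, self.target, self.pathway, self.effect])

    def search_queries(self) -> list[str]:
        # One query per link in the chain, so each step can be checked on its own.
        return [
            f"{self.drug} {self.target}",
            f"{self.target} {self.pathway}",
            f"{self.pathway} {self.effect}",
        ]


hypothesis = MechanismHypothesis("DrugX", "TargetY", "PathwayZ", "EffectW")
print(hypothesis.chain())
```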
151
+
152
+ ---
153
+
154
+ ### **Phase 8: Report Agent**
155
+
156
+ *Goal: Generate structured scientific reports with proper citations.*
157
+
158
+ - [ ] Implement `ResearchReport` model with all sections.
159
+ - [ ] Implement `ReportAgent` for synthesis.
160
+ - [ ] Include methodology, limitations, formatted references.
161
+ - [ ] Integrate as final synthesis step in Magentic workflow.
162
+ - **Deliverable**: Publication-quality research reports.
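
As a rough illustration of a `ResearchReport` model "with all sections", the sketch below holds a few of the sections mentioned in this roadmap and renders them as markdown. The field names and the `to_markdown` method are assumptions; the actual spec may define more sections or a different layout.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchReport:
    title: str
    executive_summary: str
    methodology: str
    limitations: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        lines = [
            f"# {self.title}",
            "",
            "## Executive Summary",
            self.executive_summary,
            "",
            "## Methodology",
            self.methodology,
            "",
            "## Limitations",
        ]
        lines += [f"- {item}" for item in self.limitations]
        lines += ["", "## References"]
        # Numbered references, starting at 1.
        lines += [f"{i}. {ref}" for i, ref in enumerate(self.references, start=1)]
        return "\n".join(lines)
```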
163
+
164
+ ---
165
+
166
+ ## Complete Architecture (Phases 1-8)
167
+
168
+ ```
169
+ User Query
170
+ ↓
171
+ Gradio UI (Phase 4)
172
+ ↓
173
+ Magentic Manager (Phase 5)
174
+ ├── SearchAgent (Phase 2+5) ←→ PubMed + Web + VectorDB (Phase 6)
175
+ ├── HypothesisAgent (Phase 7) ←→ Mechanistic Reasoning
176
+ ├── JudgeAgent (Phase 3+5) ←→ Evidence Assessment
177
+ └── ReportAgent (Phase 8) ←→ Final Synthesis
178
+ ↓
179
+ Structured Research Report
180
+ ```
181
 
182
  ---
183
 
184
  ## Spec Documents
185
 
186
+ 1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)** ✅
187
+ 2. **[Phase 2 Spec: Search Slice](02_phase_search.md)** ✅
188
+ 3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)** ✅
189
+ 4. **[Phase 4 Spec: UI & Loop](04_phase_ui.md)** ✅
190
+ 5. **[Phase 5 Spec: Magentic Integration](05_phase_magentic.md)** ✅
191
+ 6. **[Phase 6 Spec: Embeddings & Semantic Search](06_phase_embeddings.md)**
192
+ 7. **[Phase 7 Spec: Hypothesis Agent](07_phase_hypothesis.md)**
193
+ 8. **[Phase 8 Spec: Report Agent](08_phase_report.md)**
194
+
195
+ ---
196
+
197
+ ## Progress Summary
198
+
199
+ | Phase | Status | Deliverable |
200
+ |-------|--------|-------------|
201
+ | Phase 1: Foundation | ✅ COMPLETE | CI-ready repo with uv/pytest |
202
+ | Phase 2: Search | ✅ COMPLETE | PubMed + Web search |
203
+ | Phase 3: Judge | ✅ COMPLETE | LLM evidence assessment |
204
+ | Phase 4: UI & Loop | ✅ COMPLETE | Working Gradio app |
205
+ | Phase 5: Magentic | ✅ COMPLETE | Multi-agent orchestration |
206
+ | Phase 6: Embeddings | 📝 SPEC READY | Semantic search |
207
+ | Phase 7: Hypothesis | 📝 SPEC READY | Mechanistic reasoning |
208
+ | Phase 8: Report | 📝 SPEC READY | Structured reports |
209
 
210
+ *Phases 1-5 completed in ONE DAY. Phases 6-8 specs ready for implementation.*