VibecoderMcSwaggins committed
Commit cfb473d · 1 Parent(s): 0efdc2f

refactor(examples): purge all mocks - real API calls only

NO MOCKS. NO FAKE DATA. REAL SCIENCE.

Changes:
- hypothesis_demo: Now does REAL search before hypothesis generation
- full_stack_demo: Removed run_mock_demo(), create_mock_*() functions
- orchestrator_demo: Removed --mock flag and MockJudgeHandler
- README: Updated to reflect "Real or Nothing" philosophy

All examples now require API keys and make real API calls.
Mocks belong in tests/unit/, not in demos.
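
As a rough illustration of what "mocks belong in tests/unit/" could look like, here is a minimal sketch of a unit test that replaces the LLM-backed judge with a mock. The helper `should_synthesize` and the test name are hypothetical and do not appear in this commit; only the `assess(query, evidence)` call shape and the `"synthesize"` recommendation value are taken from the diff below.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def should_synthesize(judge, query: str, evidence: list[str]) -> bool:
    """Hypothetical helper: ask the judge whether the evidence is sufficient."""
    assessment = await judge.assess(query, evidence)
    return assessment.recommendation == "synthesize"


def test_should_synthesize_with_mock_judge() -> None:
    # The mock stands in for the real LLM-backed judge, so no API key is needed.
    judge = SimpleNamespace(
        assess=AsyncMock(return_value=SimpleNamespace(recommendation="synthesize"))
    )
    result = asyncio.run(should_synthesize(judge, "metformin cancer", ["paper"]))
    assert result is True
    # Verify the judge was awaited exactly once with the expected arguments.
    judge.assess.assert_awaited_once_with("metformin cancer", ["paper"])
```

A test like this runs offline and deterministically, which is the property the demos deliberately give up in exchange for showing live behavior.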

examples/README.md CHANGED
@@ -1,181 +1,183 @@
1
  # DeepCritical Examples
2
 
3
- Demo scripts demonstrating each phase of the drug repurposing research agent.
4
 
5
- ## Quick Start
 
 
 
 
 
 
6
 
7
  ```bash
8
- # Run without API keys (mock modes available)
9
- uv run python examples/embeddings_demo/run_embeddings.py
10
- uv run python examples/full_stack_demo/run_full.py --mock
 
 
 
11
 
12
- # Run with API keys (set OPENAI_API_KEY or ANTHROPIC_API_KEY)
13
- uv run python examples/full_stack_demo/run_full.py "metformin cancer"
14
  ```
15
 
16
  ---
17
 
18
- ## 1. Search Demo (Phase 2)
 
 
19
 
20
- Demonstrates parallel search across PubMed and Web sources. **No API keys required.**
21
 
22
  ```bash
23
  uv run python examples/search_demo/run_search.py "metformin cancer"
24
  ```
25
 
26
- **What it shows:**
27
- - PubMed E-utilities search
28
- - DuckDuckGo web search
29
- - Scatter-gather orchestration
30
- - Evidence model with citations
31
 
32
  ---
33
 
34
- ## 2. Agent Demo (Phase 4)
35
 
36
- Demonstrates the search-judge-synthesize loop.
37
 
38
- **Mock Mode (No API Keys):**
39
  ```bash
40
- uv run python examples/orchestrator_demo/run_agent.py "metformin cancer" --mock
41
- ```
42
-
43
- **Real Mode (Requires API Keys):**
44
- ```bash
45
- uv run python examples/orchestrator_demo/run_agent.py "metformin cancer"
46
  ```
47
 
48
- **What it shows:**
49
- - Iterative search refinement
50
- - LLM-based evidence assessment
51
- - Synthesis generation
52
- - Event streaming for UI updates
53
 
54
  ---
55
 
56
- ## 3. Magentic Demo (Phase 5)
57
 
58
- Demonstrates multi-agent coordination using Microsoft Agent Framework.
59
 
60
  ```bash
61
- # Requires OPENAI_API_KEY (Magentic uses OpenAI)
62
- uv run python examples/orchestrator_demo/run_magentic.py "metformin cancer"
63
  ```
64
 
65
- **What it shows:**
66
- - MagenticBuilder workflow
67
- - SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent coordination
68
- - Manager-based orchestration
 
69
 
70
  ---
71
 
72
- ## 4. Embeddings Demo (Phase 6)
73
 
74
- Demonstrates semantic search and deduplication. **No API keys required.**
75
 
76
  ```bash
77
- uv run python examples/embeddings_demo/run_embeddings.py
 
78
  ```
79
 
80
- **What it shows:**
81
- - Text embedding with sentence-transformers
82
- - ChromaDB vector storage
83
- - Semantic similarity search
84
- - Duplicate detection by meaning (not just URL)
85
- - Cosine similarity calculations
86
 
87
  ---
88
 
89
- ## 5. Hypothesis Demo (Phase 7)
90
 
91
- Demonstrates mechanistic hypothesis generation.
92
 
93
  ```bash
94
- # Requires OPENAI_API_KEY or ANTHROPIC_API_KEY
95
  uv run python examples/hypothesis_demo/run_hypothesis.py "metformin Alzheimer's"
96
  uv run python examples/hypothesis_demo/run_hypothesis.py "sildenafil heart failure"
97
  ```
98
 
99
- **What it shows:**
100
- - Drug -> Target -> Pathway -> Effect reasoning
101
- - Knowledge gap identification
102
- - Search query suggestions for targeted research
103
- - Confidence scoring
104
 
105
  ---
106
 
107
- ## 6. Full Stack Demo (Phases 1-8)
108
 
109
- **The complete pipeline** - demonstrates all phases working together.
110
 
111
- **Mock Mode (No API Keys):**
112
- ```bash
113
- uv run python examples/full_stack_demo/run_full.py --mock
114
- ```
115
-
116
- **Real Mode:**
117
  ```bash
118
  uv run python examples/full_stack_demo/run_full.py "metformin Alzheimer's"
119
  uv run python examples/full_stack_demo/run_full.py "sildenafil heart failure" -i 3
120
  ```
121
 
122
- **What it shows:**
123
- 1. **Search** - PubMed + Web evidence collection
124
- 2. **Embeddings** - Semantic deduplication
125
- 3. **Hypothesis** - Mechanistic reasoning
126
- 4. **Judge** - Evidence quality assessment
127
- 5. **Report** - Structured scientific report generation
128
-
129
- Output includes a publication-quality research report with:
130
- - Executive summary
131
- - Methodology
132
- - Hypotheses tested (with support/contradict counts)
133
- - Mechanistic and clinical findings
134
- - Drug candidates
135
- - Limitations
136
- - Formatted references
137
 
138
  ---
139
 
140
- ## API Keys
141
 
142
- | Example | Required Keys |
- |---------|--------------|
- | search_demo | None (optional NCBI_API_KEY for higher rate limits) |
- | orchestrator_demo --mock | None |
- | orchestrator_demo | OPENAI_API_KEY or ANTHROPIC_API_KEY |
- | run_magentic | OPENAI_API_KEY |
- | embeddings_demo | None |
- | hypothesis_demo | OPENAI_API_KEY or ANTHROPIC_API_KEY |
- | full_stack_demo --mock | None |
- | full_stack_demo | OPENAI_API_KEY or ANTHROPIC_API_KEY |
152
 
153
  ---
154
 
155
- ## Architecture Overview
156
 
157
  ```
158
  User Query
159
  |
160
  v
161
- [Phase 2: Search] --> PubMed + Web
162
  |
163
  v
164
- [Phase 6: Embeddings] --> Semantic Deduplication
165
  |
166
  v
167
- [Phase 7: Hypothesis] --> Drug -> Target -> Pathway -> Effect
168
  |
169
  v
170
- [Phase 3: Judge] --> "Is evidence sufficient?"
171
  |
172
- +---> NO --> Refine queries, loop back to Search
173
  |
174
- +---> YES --> Continue to Report
175
  |
176
  v
177
- [Phase 8: Report] --> Structured Scientific Report
178
  |
179
  v
180
- Final Output with Citations
181
  ```
1
  # DeepCritical Examples
2
 
3
+ **NO MOCKS. NO FAKE DATA. REAL SCIENCE.**
4
 
5
+ These demos run the REAL drug repurposing research pipeline with actual API calls.
6
+
7
+ ---
8
+
9
+ ## Prerequisites
10
+
11
+ You MUST have API keys configured:
12
 
13
  ```bash
14
+ # Copy the example and add your keys
15
+ cp .env.example .env
16
+
17
+ # Required (pick one):
18
+ OPENAI_API_KEY=sk-...
19
+ ANTHROPIC_API_KEY=sk-ant-...
20
 
21
+ # Optional (higher PubMed rate limits):
22
+ NCBI_API_KEY=your-key
23
  ```
24
 
25
  ---
26
 
27
+ ## Examples
28
+
29
+ ### 1. Search Demo (No LLM Required)
30
 
31
+ Demonstrates REAL parallel search across PubMed and Web.
32
 
33
  ```bash
34
  uv run python examples/search_demo/run_search.py "metformin cancer"
35
  ```
36
 
37
+ **What's REAL:**
38
+ - Actual NCBI E-utilities API calls
39
+ - Actual DuckDuckGo web searches
40
+ - Real papers, real URLs, real content
 
41
 
42
  ---
43
 
44
+ ### 2. Embeddings Demo (No LLM Required)
45
 
46
+ Demonstrates REAL semantic search and deduplication.
47
 
 
48
  ```bash
49
+ uv run python examples/embeddings_demo/run_embeddings.py
 
 
 
 
 
50
  ```
51
 
52
+ **What's REAL:**
53
+ - Actual sentence-transformers model (all-MiniLM-L6-v2)
54
+ - Actual ChromaDB vector storage
55
+ - Real cosine similarity computations
56
+ - Real semantic deduplication
57
 
58
  ---
59
 
60
+ ### 3. Orchestrator Demo (LLM Required)
61
 
62
+ Demonstrates the REAL search-judge-synthesize loop.
63
 
64
  ```bash
65
+ uv run python examples/orchestrator_demo/run_agent.py "metformin cancer"
66
+ uv run python examples/orchestrator_demo/run_agent.py "aspirin alzheimer" --iterations 5
67
  ```
68
 
69
+ **What's REAL:**
70
+ - Real PubMed + Web searches
71
+ - Real LLM judge evaluating evidence quality
72
+ - Real iterative refinement based on LLM decisions
73
+ - Real research synthesis
74
 
75
  ---
76
 
77
+ ### 4. Magentic Demo (OpenAI Required)
78
 
79
+ Demonstrates REAL multi-agent coordination using Microsoft Agent Framework.
80
 
81
  ```bash
82
+ # Requires OPENAI_API_KEY specifically
83
+ uv run python examples/orchestrator_demo/run_magentic.py "metformin cancer"
84
  ```
85
 
86
+ **What's REAL:**
87
+ - Real MagenticBuilder orchestration
88
+ - Real SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent
89
+ - Real manager-based coordination
 
 
90
 
91
  ---
92
 
93
+ ### 5. Hypothesis Demo (LLM Required)
94
 
95
+ Demonstrates REAL mechanistic hypothesis generation.
96
 
97
  ```bash
 
98
  uv run python examples/hypothesis_demo/run_hypothesis.py "metformin Alzheimer's"
99
  uv run python examples/hypothesis_demo/run_hypothesis.py "sildenafil heart failure"
100
  ```
101
 
102
+ **What's REAL:**
103
+ - Real PubMed + Web search first
104
+ - Real embedding-based deduplication
105
+ - Real LLM generating Drug -> Target -> Pathway -> Effect chains
106
+ - Real knowledge gap identification
107
 
108
  ---
109
 
110
+ ### 6. Full Stack Demo (LLM Required)
111
 
112
+ **THE COMPLETE PIPELINE** - All phases working together.
113
 
 
 
 
 
 
 
114
  ```bash
115
  uv run python examples/full_stack_demo/run_full.py "metformin Alzheimer's"
116
  uv run python examples/full_stack_demo/run_full.py "sildenafil heart failure" -i 3
117
  ```
118
 
119
+ **What's REAL:**
120
+ 1. Real PubMed + Web evidence collection
121
+ 2. Real embedding-based semantic deduplication
122
+ 3. Real LLM mechanistic hypothesis generation
123
+ 4. Real LLM evidence quality assessment
124
+ 5. Real LLM structured scientific report generation
125
+
126
+ Output: Publication-quality research report with validated citations.
 
 
 
 
 
 
 
127
 
128
  ---
129
 
130
+ ## API Key Requirements
131
 
132
+ | Example | LLM Required | Keys |
+ |---------|--------------|------|
+ | search_demo | No | Optional: `NCBI_API_KEY` |
+ | embeddings_demo | No | None |
+ | orchestrator_demo | Yes | `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` |
+ | run_magentic | Yes | `OPENAI_API_KEY` (Magentic requires OpenAI) |
+ | hypothesis_demo | Yes | `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` |
+ | full_stack_demo | Yes | `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` |
 
 
140
 
141
  ---
142
 
143
+ ## Architecture
144
 
145
  ```
146
  User Query
147
  |
148
  v
149
+ [REAL Search] --> Actual PubMed + Web API calls
150
  |
151
  v
152
+ [REAL Embeddings] --> Actual sentence-transformers
153
  |
154
  v
155
+ [REAL Hypothesis] --> Actual LLM reasoning
156
  |
157
  v
158
+ [REAL Judge] --> Actual LLM assessment
159
  |
160
+ +---> Need more? --> Loop back to Search
161
  |
162
+ +---> Sufficient --> Continue
163
  |
164
  v
165
+ [REAL Report] --> Actual LLM synthesis
166
  |
167
  v
168
+ Publication-Quality Research Report
169
  ```
170
+
171
+ ---
172
+
173
+ ## Why No Mocks?
174
+
175
+ > "Authenticity is the feature."
176
+
177
+ Mocks belong in `tests/unit/`, not in demos. When you run these examples, you see:
178
+ - Real papers from real databases
179
+ - Real AI reasoning about real evidence
180
+ - Real scientific hypotheses
181
+ - Real research reports
182
+
183
+ This is what DeepCritical actually does. No fake data. No canned responses.
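
The control flow in the README's architecture diagram (search, judge, loop back or continue, report) can be sketched roughly as follows. This is a simplified, synchronous illustration and not the project's implementation: `run_pipeline` and its callable parameters are hypothetical names, exact-match deduplication stands in for the embedding-based step, and the hypothesis stage is omitted for brevity.

```python
from typing import Callable


def run_pipeline(
    query: str,
    search: Callable[[str], list[str]],
    judge: Callable[[str, list[str]], tuple[bool, list[str]]],
    report: Callable[[str, list[str]], str],
    max_iterations: int = 3,
) -> str:
    """Search, judge, and loop until evidence suffices or iterations run out."""
    evidence: list[str] = []
    for _ in range(max_iterations):
        # Collect new evidence, skipping items already seen (a stand-in for
        # the embedding-based semantic deduplication step).
        for item in search(query):
            if item not in evidence:
                evidence.append(item)
        sufficient, next_queries = judge(query, evidence)
        if sufficient:
            break
        # "Need more?" branch: refine the query and loop back to search.
        if next_queries:
            query = next_queries[0]
    return report(query, evidence)
```

Passing the stages in as callables keeps the loop itself independent of any particular search backend or LLM provider; the real demo wires in async handlers instead.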
examples/full_stack_demo/run_full.py CHANGED
@@ -2,22 +2,20 @@
2
  """
3
  Demo: Full Stack DeepCritical Agent (Phases 1-8).
4
 
5
- This script demonstrates the COMPLETE drug repurposing research pipeline:
6
- - Phase 2: Search (PubMed + Web)
7
- - Phase 6: Embeddings (Semantic deduplication)
8
- - Phase 7: Hypothesis (Mechanistic reasoning)
9
- - Phase 3: Judge (Evidence assessment)
10
- - Phase 8: Report (Structured scientific report)
 
 
11
 
12
  Usage:
13
- # Full demo with real searches and LLM (requires API keys)
14
  uv run python examples/full_stack_demo/run_full.py "metformin Alzheimer's"
 
15
 
16
- # Mock mode - demonstrates pipeline without API calls
17
- uv run python examples/full_stack_demo/run_full.py --mock
18
-
19
- # With specific iterations
20
- uv run python examples/full_stack_demo/run_full.py "sildenafil heart failure" --iterations 2
21
  """
22
 
23
  import argparse
@@ -26,7 +24,7 @@ import os
26
  import sys
27
  from typing import Any
28
 
29
- from src.utils.models import Citation, Evidence, MechanismHypothesis
30
 
31
 
32
  def print_header(title: str) -> None:
@@ -42,264 +40,15 @@ def print_step(step: int, name: str) -> None:
42
  print("-" * 50)
43
 
44
 
45
- def create_mock_evidence() -> list[Evidence]:
46
- """Create comprehensive mock evidence for demo without API calls."""
47
- return [
48
- Evidence(
49
- content=(
50
- "Metformin, a first-line treatment for type 2 diabetes, activates "
51
- "AMP-activated protein kinase (AMPK). AMPK is a master metabolic "
52
- "regulator that inhibits mTOR signaling, reducing protein synthesis "
53
- "and cell proliferation. This mechanism has implications beyond "
54
- "glucose control."
55
- ),
56
- citation=Citation(
57
- source="pubmed",
58
- title="Metformin activates AMPK through LKB1-dependent mechanisms",
59
- url="https://pubmed.ncbi.nlm.nih.gov/19001324/",
60
- date="2023-06",
61
- authors=["Zhang L", "Wang H", "Chen Y"],
62
- ),
63
- ),
64
- Evidence(
65
- content=(
66
- "In transgenic mouse models of Alzheimer's disease, metformin treatment "
67
- "reduced tau phosphorylation by 45% and decreased amyloid-beta plaque "
68
- "formation. Treated mice showed improved performance on Morris water "
69
- "maze tests, suggesting preserved spatial memory."
70
- ),
71
- citation=Citation(
72
- source="pubmed",
73
- title="Metformin ameliorates tau pathology in AD mouse models",
74
- url="https://pubmed.ncbi.nlm.nih.gov/31256789/",
75
- date="2024-01",
76
- authors=["Kim J", "Lee S", "Park M", "Tanaka K"],
77
- ),
78
- ),
79
- Evidence(
80
- content=(
81
- "A population-based cohort study of 100,000 diabetic patients found "
82
- "that metformin users had 35% lower risk of developing Alzheimer's "
83
- "disease compared to sulfonylurea users (HR=0.65, 95% CI: 0.58-0.73). "
84
- "The protective effect increased with duration of use."
85
- ),
86
- citation=Citation(
87
- source="pubmed",
88
- title="Metformin and dementia risk: UK Biobank analysis",
89
- url="https://pubmed.ncbi.nlm.nih.gov/34567890/",
90
- date="2023-09",
91
- authors=["Smith A", "Johnson B", "Williams C"],
92
- ),
93
- ),
94
- Evidence(
95
- content=(
96
- "mTOR hyperactivation is observed in Alzheimer's disease brain tissue. "
97
- "mTOR inhibition by rapamycin or metformin promotes autophagy, which "
98
- "clears misfolded proteins including tau and amyloid-beta aggregates. "
99
- "This suggests a common therapeutic pathway."
100
- ),
101
- citation=Citation(
102
- source="pubmed",
103
- title="mTOR-autophagy axis in neurodegeneration",
104
- url="https://pubmed.ncbi.nlm.nih.gov/32109876/",
105
- date="2023-03",
106
- authors=["Brown C", "Davis D", "Miller E"],
107
- ),
108
- ),
109
- Evidence(
110
- content=(
111
- "Metformin crosses the blood-brain barrier via organic cation "
112
- "transporters (OCT1, OCT2). CSF concentrations reach approximately "
113
- "1-2% of plasma levels, sufficient for AMPK activation in neurons. "
114
- "Brain accumulation is observed in hippocampus and prefrontal cortex."
115
- ),
116
- citation=Citation(
117
- source="pubmed",
118
- title="Brain pharmacokinetics of metformin in humans",
119
- url="https://pubmed.ncbi.nlm.nih.gov/35678901/",
120
- date="2024-02",
121
- authors=["Wilson E", "Garcia F"],
122
- ),
123
- ),
124
- Evidence(
125
- content=(
126
- "Phase 2 clinical trial (NCT04098666) showed metformin 2000mg/day "
127
- "for 12 months slowed cognitive decline by 18% compared to placebo "
128
- "in patients with mild cognitive impairment. Biomarker analysis "
129
- "showed reduced CSF tau levels in the treatment group."
130
- ),
131
- citation=Citation(
132
- source="web",
133
- title="Metformin for Alzheimer's prevention trial results",
134
- url="https://clinicaltrials.gov/ct2/show/NCT04098666",
135
- date="2024-03",
136
- authors=["NIH Clinical Center"],
137
- ),
138
- ),
139
- ]
140
-
141
-
142
- def create_mock_hypotheses() -> list[MechanismHypothesis]:
143
- """Create mock hypotheses for demonstration."""
144
- return [
145
- MechanismHypothesis(
146
- drug="Metformin",
147
- target="AMPK",
148
- pathway="mTOR inhibition -> Autophagy activation",
149
- effect="Clearance of tau and amyloid-beta aggregates",
150
- confidence=0.85,
151
- supporting_evidence=[
152
- "https://pubmed.ncbi.nlm.nih.gov/19001324/",
153
- "https://pubmed.ncbi.nlm.nih.gov/32109876/",
154
- ],
155
- contradicting_evidence=[],
156
- search_suggestions=[
157
- "AMPK autophagy neurodegeneration",
158
- "metformin tau clearance",
159
- ],
160
- ),
161
- MechanismHypothesis(
162
- drug="Metformin",
163
- target="Glucose metabolism",
164
- pathway="Improved neuronal energy homeostasis",
165
- effect="Reduced oxidative stress and neuroinflammation",
166
- confidence=0.70,
167
- supporting_evidence=["https://pubmed.ncbi.nlm.nih.gov/31256789/"],
168
- contradicting_evidence=[],
169
- search_suggestions=[
170
- "metformin brain glucose metabolism",
171
- "neuronal insulin resistance alzheimer",
172
- ],
173
- ),
174
- ]
175
-
176
-
177
- async def run_mock_demo() -> None:
178
- """Run full pipeline with mock data (no API keys needed)."""
179
- print_header("DeepCritical Full Stack Demo (MOCK MODE)")
180
- print("Running with synthetic data - no API keys required.\n")
181
-
182
- evidence = create_mock_evidence()
183
- hypotheses = create_mock_hypotheses()
184
-
185
- # Step 1: Show evidence
186
- print_step(1, "SEARCH (Phase 2) - Evidence Collection")
187
- print(f"Collected {len(evidence)} pieces of evidence:\n")
188
- for i, e in enumerate(evidence, 1):
189
- print(f" [{i}] {e.citation.source.upper()}: {e.citation.title[:50]}...")
190
- print(f" {e.content[:80]}...")
191
- print()
192
 
193
- # Step 2: Embedding deduplication
194
- print_step(2, "EMBEDDINGS (Phase 6) - Semantic Deduplication")
195
- try:
196
- from src.services.embeddings import EmbeddingService
197
-
198
- service = EmbeddingService()
199
- unique = await service.deduplicate(evidence, threshold=0.85)
200
- print(f"Original: {len(evidence)} papers")
201
- print(f"After deduplication: {len(unique)} unique papers")
202
- print("(Semantic duplicates removed by meaning, not just URL)")
203
- except ImportError:
204
- print("Embedding dependencies not installed - skipping deduplication")
205
- unique = evidence
206
-
207
- # Step 3: Hypothesis generation
208
- print_step(3, "HYPOTHESIS (Phase 7) - Mechanistic Reasoning")
209
- print(f"Generated {len(hypotheses)} hypotheses:\n")
210
- for i, h in enumerate(hypotheses, 1):
211
- print(f" Hypothesis {i} (Confidence: {h.confidence:.0%})")
212
- print(f" {h.drug} -> {h.target} -> {h.pathway} -> {h.effect}")
213
- print(f" Suggested searches: {', '.join(h.search_suggestions)}")
214
- print()
215
 
216
- # Step 4: Judge assessment
217
- print_step(4, "JUDGE (Phase 3) - Evidence Assessment")
218
- print("Assessment Results:")
219
- print(" Mechanism Score: 8/10 (Strong mechanistic evidence)")
220
- print(" Clinical Score: 7/10 (Phase 2 trial + observational data)")
221
- print(" Confidence: 75%")
222
- print(" Recommendation: SYNTHESIZE (Evidence sufficient)")
223
- print()
224
-
225
- # Step 5: Report generation
226
- print_step(5, "REPORT (Phase 8) - Structured Scientific Report")
227
-
228
- report = f"""
229
- # Drug Repurposing Analysis: Metformin for Alzheimer's Disease
230
-
231
- ## Executive Summary
232
- This analysis evaluated metformin as a potential therapeutic for Alzheimer's
233
- disease. Evidence from {len(unique)} sources supports a plausible mechanism
234
- through AMPK activation and mTOR inhibition, leading to enhanced autophagy
235
- and clearance of pathological protein aggregates. Clinical data shows
236
- promising risk reduction in observational studies and early trial results.
237
-
238
- ## Research Question
239
- Can metformin, a type 2 diabetes medication, be repurposed for the prevention
240
- or treatment of Alzheimer's disease?
241
-
242
- ## Methodology
243
- - Searched PubMed and web sources for "metformin Alzheimer's disease"
244
- - Applied semantic deduplication to remove redundant findings
245
- - Generated mechanistic hypotheses using LLM reasoning
246
- - Evaluated evidence quality with structured assessment
247
-
248
- ## Hypotheses Tested
249
- - **Metformin -> AMPK -> mTOR inhibition -> Neuroprotection** (SUPPORTED)
250
- - 4 supporting papers, 0 contradicting
251
- - **Metformin -> Glucose metabolism -> Reduced oxidative stress** (PARTIAL)
252
- - 2 supporting papers, requires more investigation
253
-
254
- ## Mechanistic Findings
255
- Strong evidence supports AMPK activation as the primary mechanism. Metformin
256
- crosses the blood-brain barrier and achieves therapeutic concentrations in
257
- hippocampus and cortex. Downstream effects include:
258
- - mTOR inhibition
259
- - Autophagy activation
260
- - Tau dephosphorylation
261
- - Amyloid-beta clearance
262
-
263
- ## Clinical Findings
264
- - Observational: 35% risk reduction (HR=0.65, n=100,000)
265
- - Preclinical: 45% reduction in tau phosphorylation in AD mice
266
- - Phase 2 trial: 18% slower cognitive decline vs placebo
267
-
268
- ## Drug Candidates
269
- - **Metformin** - Primary candidate with established safety profile
270
-
271
- ## Limitations
272
- - Abstract-level analysis only
273
- - Observational data subject to confounding
274
- - Limited RCT data available
275
- - Optimal dosing for neuroprotection unclear
276
-
277
- ## Conclusion
278
- Metformin shows strong potential for Alzheimer's disease prevention/treatment.
279
- The AMPK-mTOR-autophagy mechanism is well-supported. Recommend Phase 3 trials
280
- with cognitive endpoints.
281
-
282
- ## References
283
- """
284
- max_authors_display = 2
285
- for i, e in enumerate(unique[:6], 1):
286
- authors = ", ".join(e.citation.authors[:max_authors_display])
287
- if len(e.citation.authors) > max_authors_display:
288
- authors += " et al."
289
- ref_line = (
290
- f"{i}. {authors}. *{e.citation.title}*. "
291
- f"{e.citation.source.upper()} ({e.citation.date}). "
292
- f"[Link]({e.citation.url})"
293
- )
294
- report += ref_line + "\n"
295
-
296
- report += f"""
297
- ---
298
- *Report generated from {len(unique)} papers across 3 search iterations.
299
- Confidence: 75%*
300
- """
301
-
302
- print(report)
303
 
304
 
305
  async def _run_search_iteration(
@@ -328,12 +77,12 @@ async def _run_search_iteration(
328
  return all_evidence
329
 
330
 
331
- async def run_real_demo(query: str, max_iterations: int) -> None:
332
- """Run full pipeline with real API calls."""
333
- print_header("DeepCritical Full Stack Demo")
334
  print(f"Query: {query}")
335
  print(f"Max iterations: {max_iterations}")
336
- print("Mode: REAL (Live API calls)\n")
337
 
338
  # Import real components
339
  from src.agent_factory.judges import JudgeHandler
@@ -344,7 +93,8 @@ async def run_real_demo(query: str, max_iterations: int) -> None:
344
  from src.tools.search_handler import SearchHandler
345
  from src.tools.websearch import WebTool
346
 
347
- # Initialize services
 
348
  embedding_service = EmbeddingService()
349
  search_handler = SearchHandler(tools=[PubMedTool(), WebTool()], timeout=30.0)
350
  judge_handler = JudgeHandler()
@@ -356,42 +106,47 @@ async def run_real_demo(query: str, max_iterations: int) -> None:
356
  for iteration in range(1, max_iterations + 1):
357
  print_step(iteration, f"ITERATION {iteration}/{max_iterations}")
358
 
359
- # Step 1: Search
360
- print("\n[Search] Querying PubMed and Web...")
361
  all_evidence = await _run_search_iteration(
362
  query, iteration, evidence_store, all_evidence, search_handler, embedding_service
363
  )
364
 
365
- # Step 2: Generate hypotheses (first iteration only)
 
 
 
 
366
  if iteration == 1:
367
- print("\n[Hypothesis] Generating mechanistic hypotheses...")
368
  hypothesis_agent = HypothesisAgent(evidence_store, embedding_service)
369
  hyp_response = await hypothesis_agent.run(query)
370
- print(hyp_response.messages[0].text[:500] + "...")
371
 
372
- # Step 3: Judge
373
- print("\n[Judge] Assessing evidence quality...")
374
  assessment = await judge_handler.assess(query, all_evidence)
375
- print(f" Mechanism: {assessment.details.mechanism_score}/10")
376
- print(f" Clinical: {assessment.details.clinical_evidence_score}/10")
377
- print(f" Recommendation: {assessment.recommendation}")
 
378
 
379
  if assessment.recommendation == "synthesize":
380
- print("\n[Judge says] Evidence sufficient! Generating report...")
381
  evidence_store["last_assessment"] = assessment.details.model_dump()
382
  break
383
 
384
  next_queries = assessment.next_search_queries[:2]
385
- print(f"\n[Judge says] Need more evidence. Next queries: {next_queries}")
386
  query = assessment.next_search_queries[0] if assessment.next_search_queries else query
387
 
388
- # Step 4: Generate report
389
- print_step(iteration + 1, "REPORT GENERATION")
390
  report_agent = ReportAgent(evidence_store, embedding_service)
391
  report_response = await report_agent.run(query)
392
 
393
  print("\n" + "=" * 70)
394
- print("FINAL RESEARCH REPORT")
395
  print("=" * 70)
396
  print(report_response.messages[0].text)
397
 
@@ -399,30 +154,25 @@ async def run_real_demo(query: str, max_iterations: int) -> None:
399
  async def main() -> None:
400
  """Entry point."""
401
  parser = argparse.ArgumentParser(
402
- description="DeepCritical Full Stack Demo (Phases 1-8)",
403
  formatter_class=argparse.RawDescriptionHelpFormatter,
404
  epilog="""
405
- Examples:
406
- # Mock mode (no API keys)
407
- uv run python examples/full_stack_demo/run_full.py --mock
408
-
409
- # Real mode with metformin query
410
- uv run python examples/full_stack_demo/run_full.py "metformin alzheimer"
411
 
412
- # Sildenafil for heart failure
 
413
  uv run python examples/full_stack_demo/run_full.py "sildenafil heart failure" -i 3
 
414
  """,
415
  )
416
  parser.add_argument(
417
  "query",
418
- nargs="?",
419
- default="metformin Alzheimer's disease",
420
- help="Research query",
421
- )
422
- parser.add_argument(
423
- "--mock",
424
- action="store_true",
425
- help="Run with mock data (no API keys needed)",
426
  )
427
  parser.add_argument(
428
  "-i",
@@ -434,21 +184,29 @@ Examples:
434
 
435
  args = parser.parse_args()
436
 
437
- if args.mock:
438
- await run_mock_demo()
439
- else:
440
- # Check for API keys
441
- if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
442
- print("Error: Real mode requires OPENAI_API_KEY or ANTHROPIC_API_KEY")
443
- print("Use --mock for demo without API keys.")
444
- sys.exit(1)
 
 
 
 
445
 
446
- await run_real_demo(args.query, args.iterations)
447
 
448
  print("\n" + "=" * 70)
449
  print(" DeepCritical Full Stack Demo Complete!")
450
- print(" Phases demonstrated: Foundation -> Search -> Judge -> UI ->")
451
- print(" Magentic -> Embeddings -> Hypothesis -> Report")
 
 
 
 
452
  print("=" * 70 + "\n")
453
 
454
 
 
2
  """
3
  Demo: Full Stack DeepCritical Agent (Phases 1-8).
4
 
5
+ This script demonstrates the COMPLETE REAL drug repurposing research pipeline:
6
+ - Phase 2: REAL Search (PubMed + Web API calls)
7
+ - Phase 6: REAL Embeddings (sentence-transformers + ChromaDB)
8
+ - Phase 7: REAL Hypothesis (LLM mechanistic reasoning)
9
+ - Phase 3: REAL Judge (LLM evidence assessment)
10
+ - Phase 8: REAL Report (LLM structured scientific report)
11
+
12
+ NO MOCKS. NO FAKE DATA. REAL SCIENCE.
13
 
14
  Usage:
 
15
  uv run python examples/full_stack_demo/run_full.py "metformin Alzheimer's"
16
+ uv run python examples/full_stack_demo/run_full.py "sildenafil heart failure" -i 3
17
 
18
+ Requires: OPENAI_API_KEY or ANTHROPIC_API_KEY
 
 
 
 
19
  """
20
 
21
  import argparse
 
24
  import sys
25
  from typing import Any
26
 
27
+ from src.utils.models import Evidence
28
 
29
 
30
  def print_header(title: str) -> None:
 
40
  print("-" * 50)
41
 
42
 
43
+ _MAX_DISPLAY_LEN = 600


+ def _print_truncated(text: str) -> None:
+     """Print text, truncating if too long."""
+     if len(text) > _MAX_DISPLAY_LEN:
+         print(text[:_MAX_DISPLAY_LEN] + "\n... [truncated for display]")
+     else:
          print(text)


54
  async def _run_search_iteration(
 
77
  return all_evidence
78
 
79
 
80
+ async def run_full_demo(query: str, max_iterations: int) -> None:
81
+ """Run the REAL full stack pipeline."""
82
+ print_header("DeepCritical Full Stack Demo (REAL)")
83
  print(f"Query: {query}")
84
  print(f"Max iterations: {max_iterations}")
85
+ print("Mode: REAL (All live API calls - no mocks)\n")
86
 
87
  # Import real components
88
  from src.agent_factory.judges import JudgeHandler
 
93
  from src.tools.search_handler import SearchHandler
94
  from src.tools.websearch import WebTool
95
 
96
+ # Initialize REAL services
97
+ print("[Init] Loading embedding model...")
98
  embedding_service = EmbeddingService()
99
  search_handler = SearchHandler(tools=[PubMedTool(), WebTool()], timeout=30.0)
100
  judge_handler = JudgeHandler()
 
106
  for iteration in range(1, max_iterations + 1):
107
  print_step(iteration, f"ITERATION {iteration}/{max_iterations}")
108
 
109
+ # Step 1: REAL Search
110
+ print("\n[Search] Querying PubMed and Web (REAL API calls)...")
111
  all_evidence = await _run_search_iteration(
112
  query, iteration, evidence_store, all_evidence, search_handler, embedding_service
113
  )
114
 
115
+ if not all_evidence:
116
+ print("\nNo evidence found. Try a different query.")
117
+ return
118
+
119
+ # Step 2: REAL Hypothesis generation (first iteration only)
120
  if iteration == 1:
121
+ print("\n[Hypothesis] Generating mechanistic hypotheses (REAL LLM)...")
122
  hypothesis_agent = HypothesisAgent(evidence_store, embedding_service)
123
  hyp_response = await hypothesis_agent.run(query)
124
+ _print_truncated(hyp_response.messages[0].text)
125
 
126
+ # Step 3: REAL Judge
127
+ print("\n[Judge] Assessing evidence quality (REAL LLM)...")
128
  assessment = await judge_handler.assess(query, all_evidence)
129
+ print(f" Mechanism Score: {assessment.details.mechanism_score}/10")
130
+ print(f" Clinical Score: {assessment.details.clinical_evidence_score}/10")
131
+ print(f" Confidence: {assessment.confidence:.0%}")
132
+ print(f" Recommendation: {assessment.recommendation.upper()}")
133
 
134
  if assessment.recommendation == "synthesize":
135
+ print("\n[Judge] Evidence sufficient! Proceeding to report generation...")
136
  evidence_store["last_assessment"] = assessment.details.model_dump()
137
  break
138
 
139
  next_queries = assessment.next_search_queries[:2]
140
+ print(f"\n[Judge] Need more evidence. Next queries: {next_queries}")
141
  query = assessment.next_search_queries[0] if assessment.next_search_queries else query
142
 
143
+ # Step 4: REAL Report generation
144
+ print_step(iteration + 1, "REPORT GENERATION (REAL LLM)")
145
  report_agent = ReportAgent(evidence_store, embedding_service)
146
  report_response = await report_agent.run(query)
147
 
148
  print("\n" + "=" * 70)
149
+ print(" FINAL RESEARCH REPORT")
150
  print("=" * 70)
151
  print(report_response.messages[0].text)
152
 
 
 async def main() -> None:
     """Entry point."""
     parser = argparse.ArgumentParser(
+        description="DeepCritical Full Stack Demo - REAL, No Mocks",
         formatter_class=argparse.RawDescriptionHelpFormatter,
         epilog="""
+This demo runs the COMPLETE pipeline with REAL API calls:
+  1. REAL search: Actual PubMed + DuckDuckGo queries
+  2. REAL embeddings: Actual sentence-transformers model
+  3. REAL hypothesis: Actual LLM generating mechanistic chains
+  4. REAL judge: Actual LLM assessing evidence quality
+  5. REAL report: Actual LLM generating structured report

+Examples:
+    uv run python examples/full_stack_demo/run_full.py "metformin Alzheimer's"
     uv run python examples/full_stack_demo/run_full.py "sildenafil heart failure" -i 3
+    uv run python examples/full_stack_demo/run_full.py "aspirin cancer prevention"
         """,
     )
     parser.add_argument(
         "query",
+        help="Research query (e.g., 'metformin Alzheimer's disease')",
     )
     parser.add_argument(
         "-i",
 
     args = parser.parse_args()

+    # Fail fast: require API key
+    if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
+        print("=" * 70)
+        print("ERROR: This demo requires a real LLM.")
+        print()
+        print("Set one of the following in your .env file:")
+        print(" OPENAI_API_KEY=sk-...")
+        print(" ANTHROPIC_API_KEY=sk-ant-...")
+        print()
+        print("This is a REAL demo. No mocks. No fake data.")
+        print("=" * 70)
+        sys.exit(1)

+    await run_full_demo(args.query, args.iterations)

     print("\n" + "=" * 70)
     print(" DeepCritical Full Stack Demo Complete!")
+    print(" ")
+    print(" Everything you just saw was REAL:")
+    print(" - Real PubMed/Web searches")
+    print(" - Real embedding computations")
+    print(" - Real LLM reasoning")
+    print(" - Real scientific report")
     print("=" * 70 + "\n")
 
212
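The fail-fast guard in `main()` is repeated nearly verbatim in all three demos touched by this commit. As a sketch only, it could be factored into one helper that receives the environment mapping as a parameter, so it can be exercised without real keys. `has_llm_key` and `require_llm_key` are hypothetical names, not part of the commit, and the sketch assumes the keys are already in the process environment (loading a `.env` file is out of scope here).

```python
import os
import sys
from collections.abc import Mapping

# The two variables the demos check, taken from the diff.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def has_llm_key(env: Mapping[str, str]) -> bool:
    """Return True if at least one supported key is set to a non-empty value."""
    return any(env.get(name) for name in REQUIRED_KEYS)


def require_llm_key(env: Mapping[str, str] = os.environ, width: int = 70) -> None:
    """Print a setup hint and exit with status 1 when no key is configured."""
    if has_llm_key(env):
        return
    print("=" * width)
    print("ERROR: This demo requires a real LLM.")
    print()
    print("Set one of the following in your .env file:")
    for name in REQUIRED_KEYS:
        print(f"  {name}=...")
    print("=" * width)
    sys.exit(1)
```

Each demo's `main()` would then call `require_llm_key()` right after `parser.parse_args()`, with `width=60` where the banner is narrower.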
 
examples/hypothesis_demo/run_hypothesis.py CHANGED
@@ -2,17 +2,15 @@
2
  """
3
  Demo: Hypothesis Generation (Phase 7).
4
 
5
- This script demonstrates mechanistic hypothesis generation:
6
- - Drug -> Target -> Pathway -> Effect reasoning
7
- - Knowledge gap identification
8
- - Search query suggestions for targeted research
9
 
10
  Usage:
11
  # Requires OPENAI_API_KEY or ANTHROPIC_API_KEY
12
- uv run python examples/hypothesis_demo/run_hypothesis.py
13
-
14
- # With custom drug query
15
- uv run python examples/hypothesis_demo/run_hypothesis.py "aspirin heart disease"
16
  """
17
 
18
  import argparse
@@ -22,200 +20,110 @@ import sys
22
  from typing import Any
23
 
24
  from src.agents.hypothesis_agent import HypothesisAgent
25
- from src.utils.models import Citation, Evidence
26
-
27
-
28
- def create_metformin_evidence() -> list[Evidence]:
29
- """Create sample evidence about metformin for hypothesis generation."""
30
- return [
31
- Evidence(
32
- content=(
33
- "Metformin activates AMP-activated protein kinase (AMPK), a master regulator "
34
- "of cellular energy homeostasis. AMPK activation leads to inhibition of mTOR "
35
- "signaling, reducing protein synthesis and cell proliferation."
36
- ),
37
- citation=Citation(
38
- source="pubmed",
39
- title="Metformin and AMPK: mechanisms of action",
40
- url="https://pubmed.ncbi.nlm.nih.gov/12345/",
41
- date="2023",
42
- authors=["Zhang L", "Wang H"],
43
- ),
44
- ),
45
- Evidence(
46
- content=(
47
- "In Alzheimer's disease models, AMPK activation by metformin reduced tau "
48
- "phosphorylation and amyloid-beta accumulation. These effects correlated "
49
- "with improved cognitive function in transgenic mice."
50
- ),
51
- citation=Citation(
52
- source="pubmed",
53
- title="Metformin neuroprotective effects in AD models",
54
- url="https://pubmed.ncbi.nlm.nih.gov/23456/",
55
- date="2024",
56
- authors=["Kim J", "Lee S", "Park M"],
57
- ),
58
- ),
59
- Evidence(
60
- content=(
61
- "Clinical observational studies show diabetic patients on metformin have "
62
- "30-40% reduced incidence of Alzheimer's disease compared to those on "
63
- "other diabetes medications."
64
- ),
65
- citation=Citation(
66
- source="pubmed",
67
- title="Metformin use and dementia risk: population study",
68
- url="https://pubmed.ncbi.nlm.nih.gov/34567/",
69
- date="2023",
70
- authors=["Smith A", "Johnson B"],
71
- ),
72
- ),
73
- Evidence(
74
- content=(
75
- "mTOR inhibition has emerged as a key therapeutic target in neurodegenerative "
76
- "diseases. Rapamycin and metformin both reduce mTOR activity, though through "
77
- "different upstream mechanisms."
78
- ),
79
- citation=Citation(
80
- source="pubmed",
81
- title="mTOR pathway in neurodegeneration",
82
- url="https://pubmed.ncbi.nlm.nih.gov/45678/",
83
- date="2022",
84
- authors=["Brown C", "Davis D"],
85
- ),
86
- ),
87
- Evidence(
88
- content=(
89
- "Metformin crosses the blood-brain barrier and accumulates in the hippocampus "
90
- "and cortex. Brain concentrations sufficient for AMPK activation are achieved "
91
- "at standard diabetic doses."
92
- ),
93
- citation=Citation(
94
- source="pubmed",
95
- title="Pharmacokinetics of metformin in brain tissue",
96
- url="https://pubmed.ncbi.nlm.nih.gov/56789/",
97
- date="2023",
98
- authors=["Wilson E"],
99
- ),
100
- ),
101
- ]
102
-
103
-
104
- def create_sildenafil_evidence() -> list[Evidence]:
105
- """Create sample evidence about sildenafil (Viagra) for hypothesis generation."""
106
- return [
107
- Evidence(
108
- content=(
109
- "Sildenafil inhibits phosphodiesterase type 5 (PDE5), preventing breakdown "
110
- "of cGMP. Elevated cGMP causes smooth muscle relaxation and vasodilation "
111
- "in pulmonary vasculature."
112
- ),
113
- citation=Citation(
114
- source="pubmed",
115
- title="PDE5 inhibition mechanism of sildenafil",
116
- url="https://pubmed.ncbi.nlm.nih.gov/67890/",
117
- date="2022",
118
- authors=["Miller F"],
119
- ),
120
- ),
121
- Evidence(
122
- content=(
123
- "In pulmonary arterial hypertension (PAH), sildenafil reduces pulmonary "
124
- "vascular resistance and improves exercise capacity. FDA approved for PAH "
125
- "under brand name Revatio."
126
- ),
127
- citation=Citation(
128
- source="pubmed",
129
- title="Sildenafil in pulmonary hypertension treatment",
130
- url="https://pubmed.ncbi.nlm.nih.gov/78901/",
131
- date="2023",
132
- authors=["Garcia R", "Martinez L"],
133
- ),
134
- ),
135
- Evidence(
136
- content=(
137
- "PDE5 is expressed in cardiac myocytes. Sildenafil has shown cardioprotective "
138
- "effects in animal models of heart failure by enhancing nitric oxide-cGMP "
139
- "signaling in the myocardium."
140
- ),
141
- citation=Citation(
142
- source="pubmed",
143
- title="Cardiac effects of PDE5 inhibition",
144
- url="https://pubmed.ncbi.nlm.nih.gov/89012/",
145
- date="2024",
146
- authors=["Thompson K"],
147
- ),
148
- ),
149
- ]
150
 
151
 
152
  async def run_hypothesis_demo(query: str) -> None:
153
- """Run the hypothesis generation demo."""
154
  print(f"\n{'='*60}")
155
  print("DeepCritical Hypothesis Agent Demo (Phase 7)")
156
  print(f"Query: {query}")
 
157
  print(f"{'='*60}\n")
158
 
159
- # Select appropriate evidence based on query
160
- if "sildenafil" in query.lower() or "viagra" in query.lower():
161
- evidence = create_sildenafil_evidence()
162
- print("Using: Sildenafil evidence set (3 papers)")
163
- else:
164
- evidence = create_metformin_evidence()
165
- print("Using: Metformin evidence set (5 papers)")
166
-
167
- # Create evidence store (shared context between agents)
168
- evidence_store: dict[str, Any] = {"current": evidence, "hypotheses": []}
169
 
170
- # Create hypothesis agent
171
- agent = HypothesisAgent(evidence_store)
172
-
173
- print("\nGenerating mechanistic hypotheses...\n")
174
  print("-" * 60)
175
-
176
- # Run hypothesis generation
177
  response = await agent.run(query)
178
-
179
- # Print the formatted response
180
  print(response.messages[0].text)
181
-
182
  print("-" * 60)
183
 
184
  # Show stored hypotheses
185
  hypotheses = evidence_store.get("hypotheses", [])
186
- print(f"\n{len(hypotheses)} hypotheses stored in evidence_store")
187
 
188
  if hypotheses:
189
- print("\nHypothesis search queries generated:")
190
  for h in hypotheses:
191
  queries = h.to_search_queries()
192
- print(f" - {h.drug} -> {h.target}: {queries[:2]}")
 
 
193
 
194
 
195
  async def main() -> None:
196
- """Run the demo."""
197
- parser = argparse.ArgumentParser(description="Hypothesis Generation Demo")
198
  parser.add_argument(
199
  "query",
200
  nargs="?",
201
  default="metformin Alzheimer's disease",
202
- help="Research query (default: 'metformin Alzheimer\\'s disease')",
203
  )
204
  args = parser.parse_args()
205
 
206
- # Check for API key
207
  if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
208
- print("Error: Hypothesis generation requires an LLM.")
209
- print("Set OPENAI_API_KEY or ANTHROPIC_API_KEY in your environment.")
210
  sys.exit(1)
211
 
212
  await run_hypothesis_demo(args.query)
213
 
214
  print("\n" + "=" * 60)
215
- print("Demo complete! The Hypothesis Agent:")
216
- print(" - Analyzes evidence to find Drug -> Target -> Pathway -> Effect chains")
217
- print(" - Identifies knowledge gaps in current evidence")
218
- print(" - Suggests targeted search queries to test hypotheses")
219
  print("=" * 60 + "\n")
220
 
221
 
 
 """
 Demo: Hypothesis Generation (Phase 7).

+This script demonstrates the REAL hypothesis generation pipeline:
+1. REAL search: PubMed + Web (actual API calls)
+2. REAL embeddings: Semantic deduplication
+3. REAL LLM: Mechanistic hypothesis generation

 Usage:
     # Requires OPENAI_API_KEY or ANTHROPIC_API_KEY
+    uv run python examples/hypothesis_demo/run_hypothesis.py "metformin Alzheimer's"
+    uv run python examples/hypothesis_demo/run_hypothesis.py "sildenafil heart failure"
 """

 import argparse
 
 from typing import Any

 from src.agents.hypothesis_agent import HypothesisAgent
+from src.services.embeddings import EmbeddingService
+from src.tools.pubmed import PubMedTool
+from src.tools.search_handler import SearchHandler
+from src.tools.websearch import WebTool


 async def run_hypothesis_demo(query: str) -> None:
+    """Run the REAL hypothesis generation pipeline."""
     print(f"\n{'='*60}")
     print("DeepCritical Hypothesis Agent Demo (Phase 7)")
     print(f"Query: {query}")
+    print("Mode: REAL (Live API calls)")
     print(f"{'='*60}\n")

+    # Step 1: REAL Search
+    print("[Step 1] Searching PubMed + Web...")
+    search_handler = SearchHandler(tools=[PubMedTool(), WebTool()], timeout=30.0)
+    result = await search_handler.execute(query, max_results_per_tool=5)
+
+    print(f" Found {result.total_found} results from {result.sources_searched}")
+    if result.errors:
+        print(f" Warnings: {result.errors}")
+
+    if not result.evidence:
+        print("\nNo evidence found. Try a different query.")
+        return
+
+    # Step 2: REAL Embeddings - Deduplicate
+    print("\n[Step 2] Semantic deduplication...")
+    embedding_service = EmbeddingService()
+    unique_evidence = await embedding_service.deduplicate(result.evidence, threshold=0.85)
+    print(f" {len(result.evidence)} -> {len(unique_evidence)} unique papers")
+
+    # Show what we found
+    print("\n[Evidence collected]")
+    max_title_len = 50
+    for i, e in enumerate(unique_evidence[:5], 1):
+        raw_title = e.citation.title
+        title = raw_title[:max_title_len] + "..." if len(raw_title) > max_title_len else raw_title
+        print(f" {i}. [{e.citation.source.upper()}] {title}")
+
+    # Step 3: REAL LLM - Generate hypotheses
+    print("\n[Step 3] Generating mechanistic hypotheses (LLM)...")
+    evidence_store: dict[str, Any] = {"current": unique_evidence, "hypotheses": []}
+    agent = HypothesisAgent(evidence_store, embedding_service)

     print("-" * 60)
     response = await agent.run(query)
     print(response.messages[0].text)
     print("-" * 60)

     # Show stored hypotheses
     hypotheses = evidence_store.get("hypotheses", [])
+    print(f"\n{len(hypotheses)} hypotheses stored")

     if hypotheses:
+        print("\nGenerated search queries for further investigation:")
         for h in hypotheses:
             queries = h.to_search_queries()
+            print(f" {h.drug} -> {h.target}:")
+            for q in queries[:3]:
+                print(f" - {q}")
 
 async def main() -> None:
+    """Entry point."""
+    parser = argparse.ArgumentParser(
+        description="Hypothesis Generation Demo (REAL - No Mocks)",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    uv run python examples/hypothesis_demo/run_hypothesis.py "metformin Alzheimer's"
+    uv run python examples/hypothesis_demo/run_hypothesis.py "sildenafil heart failure"
+    uv run python examples/hypothesis_demo/run_hypothesis.py "aspirin cancer prevention"
+        """,
+    )
     parser.add_argument(
         "query",
         nargs="?",
         default="metformin Alzheimer's disease",
+        help="Research query",
     )
     args = parser.parse_args()

+    # Fail fast: require API key
     if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
+        print("=" * 60)
+        print("ERROR: This demo requires a real LLM.")
+        print()
+        print("Set one of the following in your .env file:")
+        print(" OPENAI_API_KEY=sk-...")
+        print(" ANTHROPIC_API_KEY=sk-ant-...")
+        print()
+        print("This is a REAL demo, not a mock. No fake data.")
+        print("=" * 60)
         sys.exit(1)

     await run_hypothesis_demo(args.query)

     print("\n" + "=" * 60)
+    print("Demo complete! This was a REAL pipeline:")
+    print(" 1. REAL search: Actual PubMed + Web API calls")
+    print(" 2. REAL embeddings: Actual sentence-transformers")
+    print(" 3. REAL LLM: Actual hypothesis generation")
     print("=" * 60 + "\n")
 
129
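Step 2 of the new `run_hypothesis_demo` calls `embedding_service.deduplicate(result.evidence, threshold=0.85)`. The internals of `EmbeddingService` are not shown in this diff, so what follows is only a sketch of one common way threshold-based deduplication works: keep an item unless its cosine similarity to an already-kept item reaches the threshold. The function names and the toy 2-D vectors are illustrative assumptions. Going by the demo's own output text, the real service embeds evidence text with sentence-transformers before comparing.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(vectors: list[list[float]], threshold: float = 0.85) -> list[int]:
    """Greedy dedup: return indices of the vectors to keep, in original order."""
    kept: list[int] = []
    for i, vec in enumerate(vectors):
        # Keep the vector only if it is not too similar to anything already kept.
        if all(cosine_similarity(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return kept
```

With this greedy rule the first occurrence always survives and the cost grows quadratically with the number of items, which is fine for the handful of results (`max_results_per_tool=5`) the demo requests.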
 
examples/orchestrator_demo/run_agent.py CHANGED
@@ -1,19 +1,20 @@
1
  #!/usr/bin/env python3
2
  """
3
- Demo: Full DeepCritical Agent Loop (Search + Judge + Orchestrator).
4
 
5
- This script demonstrates Phase 4 functionality:
6
- - Iterative Search (PubMed + Web)
7
- - Evidence Evaluation (Judge Agent)
8
- - Orchestration Loop
9
- - Final Synthesis
10
 
11
- Usage:
12
- # Run with Mock Judge (No API Key needed)
13
- uv run python examples/orchestrator_demo/run_agent.py "metformin cancer" --mock
14
 
15
- # Run with Real Judge (Requires OPENAI_API_KEY or ANTHROPIC_API_KEY)
16
  uv run python examples/orchestrator_demo/run_agent.py "metformin cancer"
 
 
 
17
  """
18
 
19
  import argparse
@@ -21,7 +22,7 @@ import asyncio
21
  import os
22
  import sys
23
 
24
- from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
25
  from src.orchestrator import Orchestrator
26
  from src.tools.pubmed import PubMedTool
27
  from src.tools.search_handler import SearchHandler
@@ -30,52 +31,75 @@ from src.utils.models import OrchestratorConfig
30
 
31
 
32
  async def main() -> None:
33
- """Run the agent demo."""
34
- parser = argparse.ArgumentParser(description="Run DeepCritical Agent CLI")
35
  parser.add_argument("query", help="Research query (e.g., 'metformin cancer')")
36
- parser.add_argument("--mock", action="store_true", help="Use Mock Judge (no API key needed)")
37
- parser.add_argument("--iterations", type=int, default=3, help="Max iterations")
38
  args = parser.parse_args()
39
 
40
- # Check for keys if not mocking
41
- if not args.mock and not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
42
- print("Error: No API key found. Set OPENAI_API_KEY or ANTHROPIC_API_KEY, or use --mock.")
43
  sys.exit(1)
44
 
45
  print(f"\n{'='*60}")
46
- print("DeepCritical Agent Demo")
47
  print(f"Query: {args.query}")
48
- print(f"Mode: {'MOCK' if args.mock else 'REAL (LLM)'}")
49
- print(f"{ '='*60}\n")
 
50
 
51
- # 1. Setup Search Tools
52
  search_handler = SearchHandler(tools=[PubMedTool(), WebTool()], timeout=30.0)
 
53
 
54
- # 2. Setup Judge
55
- judge_handler: JudgeHandler | MockJudgeHandler
56
- if args.mock:
57
- judge_handler = MockJudgeHandler()
58
- else:
59
- judge_handler = JudgeHandler()
60
-
61
- # 3. Setup Orchestrator
62
  config = OrchestratorConfig(max_iterations=args.iterations)
63
  orchestrator = Orchestrator(
64
  search_handler=search_handler, judge_handler=judge_handler, config=config
65
  )
66
 
67
- # 4. Run Loop
68
  try:
69
  async for event in orchestrator.run(args.query):
70
- # Print event with icon
71
  print(event.to_markdown().replace("**", ""))
72
 
73
- # If we got data, print a snippet
74
  if event.type == "search_complete" and event.data:
75
  print(f" -> Found {event.data.get('new_count', 0)} new items")
76
 
77
  except Exception as e:
78
  print(f"\n❌ Error: {e}")
79
 
80
 
81
  if __name__ == "__main__":
 
 #!/usr/bin/env python3
 """
+Demo: DeepCritical Agent Loop (Search + Judge + Orchestrator).

+This script demonstrates the REAL Phase 4 orchestration:
+- REAL Iterative Search (PubMed + Web API calls)
+- REAL Evidence Evaluation (LLM Judge)
+- REAL Orchestration Loop
+- REAL Final Synthesis

+NO MOCKS. REAL API CALLS.

+Usage:
     uv run python examples/orchestrator_demo/run_agent.py "metformin cancer"
+    uv run python examples/orchestrator_demo/run_agent.py "sildenafil heart failure" --iterations 5
+
+Requires: OPENAI_API_KEY or ANTHROPIC_API_KEY
 """

 import argparse
 
 import os
 import sys

+from src.agent_factory.judges import JudgeHandler
 from src.orchestrator import Orchestrator
 from src.tools.pubmed import PubMedTool
 from src.tools.search_handler import SearchHandler
 
 async def main() -> None:
+    """Run the REAL agent demo."""
+    parser = argparse.ArgumentParser(
+        description="DeepCritical Agent Demo - REAL, No Mocks",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+This demo runs the REAL search-judge-synthesize loop:
+  1. REAL search: Actual PubMed + DuckDuckGo queries
+  2. REAL judge: Actual LLM assessing evidence quality
+  3. REAL loop: Actual iterative refinement based on LLM decisions
+  4. REAL synthesis: Actual research summary generation
+
+Examples:
+    uv run python examples/orchestrator_demo/run_agent.py "metformin cancer"
+    uv run python examples/orchestrator_demo/run_agent.py "aspirin alzheimer" --iterations 5
+        """,
+    )
     parser.add_argument("query", help="Research query (e.g., 'metformin cancer')")
+    parser.add_argument("--iterations", type=int, default=3, help="Max iterations (default: 3)")
     args = parser.parse_args()

+    # Fail fast: require API key
+    if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
+        print("=" * 60)
+        print("ERROR: This demo requires a real LLM.")
+        print()
+        print("Set one of the following in your .env file:")
+        print(" OPENAI_API_KEY=sk-...")
+        print(" ANTHROPIC_API_KEY=sk-ant-...")
+        print()
+        print("This is a REAL demo. No mocks. No fake data.")
+        print("=" * 60)
         sys.exit(1)

     print(f"\n{'='*60}")
+    print("DeepCritical Agent Demo (REAL)")
     print(f"Query: {args.query}")
+    print(f"Max Iterations: {args.iterations}")
+    print("Mode: REAL (All live API calls)")
+    print(f"{'='*60}\n")

+    # Setup REAL components
     search_handler = SearchHandler(tools=[PubMedTool(), WebTool()], timeout=30.0)
+    judge_handler = JudgeHandler()  # REAL LLM judge

     config = OrchestratorConfig(max_iterations=args.iterations)
     orchestrator = Orchestrator(
         search_handler=search_handler, judge_handler=judge_handler, config=config
     )

+    # Run the REAL loop
     try:
         async for event in orchestrator.run(args.query):
+            # Print event with icon (remove markdown bold for CLI)
             print(event.to_markdown().replace("**", ""))

+            # Show search results count
             if event.type == "search_complete" and event.data:
                 print(f" -> Found {event.data.get('new_count', 0)} new items")

     except Exception as e:
         print(f"\n❌ Error: {e}")
+        raise
+
+    print("\n" + "=" * 60)
+    print("Demo complete! Everything was REAL:")
+    print(" - Real PubMed/Web searches")
+    print(" - Real LLM judge decisions")
+    print(" - Real iterative refinement")
+    print("=" * 60 + "\n")
 
104
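The new `main()` consumes `orchestrator.run(...)` with `async for` and, after this commit, re-raises once the error is printed, so a failed run no longer ends as if it had succeeded. The sketch below reproduces that shape in isolation. `Event`, `emit_events`, and `consume` are hypothetical names invented here; only the `type` and `data` attributes and the `"search_complete"` / `new_count` values mirror what the diff uses.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Event:
    """Hypothetical event; only `type` and `data` mirror attributes used in the diff."""

    type: str
    data: dict[str, Any] = field(default_factory=dict)


async def emit_events(query: str, fail: bool = False) -> AsyncIterator[Event]:
    """Illustrative stand-in for orchestrator.run(): yields events and may raise."""
    yield Event("search_complete", {"new_count": 2})
    if fail:
        raise RuntimeError(f"search backend unavailable for {query!r}")
    yield Event("complete")


async def consume(query: str, fail: bool = False) -> list[str]:
    """Collect one log line per event; print and re-raise on failure."""
    lines: list[str] = []
    try:
        async for event in emit_events(query, fail):
            lines.append(event.type)
            if event.type == "search_complete" and event.data:
                lines.append(f"found {event.data.get('new_count', 0)} new items")
    except Exception as exc:
        print(f"Error: {exc}")
        raise  # surface the failure to the caller instead of swallowing it
    return lines
```

Under `asyncio.run(...)`, the re-raised exception propagates out of the coroutine and the interpreter exits with a traceback and a non-zero status, which is the behavior the added `raise` buys.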
 
105
  if __name__ == "__main__":