VibecoderMcSwaggins committed
Commit c7584c1 · 1 Parent(s): a496d9d

docs: add alternative frameworks analysis and stretch goals

- Section 14: Why we chose Pydantic AI + FastMCP over AutoGen/Claude SDK
- Section 15: Gucci bangers to steal if shipping like animals
- Priority order for stretch goal integrations
- Post-hackathon roadmap for multi-agent expansion

Files changed (1)
  1. docs/architecture/design-patterns.md +243 -1
docs/architecture/design-patterns.md CHANGED
@@ -1044,9 +1044,251 @@ class ResearchReport(BaseModel):
1044
 
1045
  ---
1046
 
1047
  ---
1048
 
1049
  **Document Status**: Official Architecture Spec
1050
  **Review Score**: 99/100
1051
- **Sections**: 13 design patterns + data models appendix
1052
  **Last Updated**: November 2025
 
1044
 
1045
  ---
1046
 
1047
+ ## 14. Alternative Frameworks Considered
1048
+
1049
+ We researched major agent frameworks before settling on our stack. Here's why we chose what we chose, and what we'd steal if we're shipping like animals and have time for Gucci upgrades.
1050
+
1051
+ ### Frameworks Evaluated
1052
+
1053
+ | Framework | Repo | What It Does |
1054
+ |-----------|------|--------------|
1055
+ | **Microsoft AutoGen** | [github.com/microsoft/autogen](https://github.com/microsoft/autogen) | Multi-agent orchestration, complex workflows |
1056
+ | **Claude Agent SDK** | [github.com/anthropics/claude-agent-sdk-python](https://github.com/anthropics/claude-agent-sdk-python) | Anthropic's official agent framework |
1057
+ | **Pydantic AI** | [github.com/pydantic/pydantic-ai](https://github.com/pydantic/pydantic-ai) | Type-safe agents, structured outputs |
1058
+
1059
+ ### Why NOT AutoGen (Microsoft)?
1060
+
1061
+ **Pros:**
1062
+ - Battle-tested multi-agent orchestration
1063
+ - `reflect_on_tool_use` - model reviews its own tool results (see the sketch below)
1064
+ - `max_tool_iterations` - built-in iteration limits
1065
+ - Concurrent tool execution
1066
+ - Rich ecosystem (AutoGen Studio, benchmarks)
1067
+
1068
+ **Cons for MVP:**
1069
+ - Heavy dependency tree (50+ packages)
1070
+ - Complex configuration (YAML + Python)
1071
+ - Overkill for single-agent search-judge loop
1072
+ - Learning curve eats into 6-day timeline
1073
+
1074
+ **Verdict:** Great for multi-agent systems. Overkill for our MVP.
1075
+
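+ For reference, here's roughly what those two knobs look like on AutoGen's `AssistantAgent` (a sketch, assuming a recent `autogen-agentchat` release where both kwargs exist; the tool functions are hypothetical):
+
+ ```python
+ # Illustration only -- this is the AutoGen config we are choosing NOT to adopt for MVP.
+ from autogen_agentchat.agents import AssistantAgent
+ from autogen_ext.models.openai import OpenAIChatCompletionClient
+
+ agent = AssistantAgent(
+     name="ResearchAgent",
+     model_client=OpenAIChatCompletionClient(model="gpt-4o"),
+     tools=[search_pubmed, search_web],  # hypothetical tool functions
+     reflect_on_tool_use=True,           # model reviews its own tool results
+     max_tool_iterations=5,              # built-in iteration limit
+     system_message="You research drug repurposing candidates.",
+ )
+ ```
+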
1076
+ ### Why NOT Claude Agent SDK (Anthropic)?
1077
+
1078
+ **Pros:**
1079
+ - Official Anthropic framework
1080
+ - Clean `@tool` decorator pattern
1081
+ - In-process MCP servers (no subprocess)
1082
+ - Hooks for pre/post tool execution
1083
+ - Direct Claude Code integration
1084
+
1085
+ **Cons for MVP:**
1086
+ - Requires Claude Code CLI bundled
1087
+ - Node.js dependency for some features
1088
+ - Designed for Claude Code ecosystem, not standalone agents
1089
+ - Less flexible for custom LLM providers
1090
+
1091
+ **Verdict:** Would be great if we were building ON Claude Code. We're building a standalone agent.
1092
+
1093
+ ### Why Pydantic AI + FastMCP (Our Choice)
1094
+
1095
+ **Pros:**
1096
+ - ✅ Simple, Pythonic API
1096
+ - ✅ Native async/await
1097
+ - ✅ Type-safe with Pydantic
1098
+ - ✅ Works with any LLM provider
1099
+ - ✅ FastMCP for clean MCP servers
1100
+ - ✅ Minimal dependencies
1101
+ - ✅ Can ship MVP in 6 days
1103
+
1104
+ **Cons:**
1105
+ - Newer framework (less battle-tested)
1106
+ - Smaller ecosystem
1107
+ - May need to build more from scratch
1108
+
1109
+ **Verdict:** Right tool for the job. Ship fast, iterate later. A minimal sketch of the stack follows.
1110
+
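+ To make that concrete, a minimal sketch of the chosen stack (illustrative only: `Assessment`, `search_pubmed`, and `_do_pubmed_search` are placeholder names, and older `pydantic-ai` releases call the structured-output kwarg `result_type` instead of `output_type`):
+
+ ```python
+ # Sketch of the chosen stack: a typed Pydantic AI judge plus a FastMCP tool server.
+ from fastmcp import FastMCP
+ from pydantic import BaseModel
+ from pydantic_ai import Agent
+
+ class Assessment(BaseModel):
+     sufficient: bool
+     rationale: str
+
+ # Pydantic AI side: type-safe judge agent with one search tool.
+ judge = Agent(
+     "openai:gpt-4o",            # any supported provider string works
+     output_type=Assessment,     # `result_type` on older pydantic-ai releases
+     system_prompt="Judge whether the collected evidence answers the question.",
+ )
+
+ @judge.tool_plain
+ async def search_pubmed(query: str, max_results: int = 10) -> list[str]:
+     """Search PubMed and return abstracts (placeholder helper underneath)."""
+     return await _do_pubmed_search(query, max_results)
+
+ # FastMCP side: expose the same search as an MCP tool for the hackathon track.
+ mcp = FastMCP("drug-repurposing")
+
+ @mcp.tool()
+ async def pubmed_search(query: str, max_results: int = 10) -> list[str]:
+     """MCP-exposed PubMed search."""
+     return await _do_pubmed_search(query, max_results)
+ ```
+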
1111
+ ---
1112
+
1113
+ ## 15. Stretch Goals: Gucci Bangers (If We're Shipping Like Animals)
1114
+
1115
+ If MVP ships early and we're crushing it, here's what we'd steal from other frameworks:
1116
+
1117
+ ### Tier 1: Quick Wins (2-4 hours each)
1118
+
1119
+ #### From Claude Agent SDK: `@tool` Decorator Pattern
1120
+ Replace our Protocol-based tools with cleaner decorators:
1121
+
1122
+ ```python
1123
+ # CURRENT (Protocol-based)
1124
+ class PubMedSearchTool:
1125
+     async def search(self, query: str, max_results: int = 10) -> List[Evidence]:
1126
+         ...
1127
+
1128
+ # UPGRADE (Decorator-based, stolen from Claude SDK)
1129
+ from claude_agent_sdk import tool
1130
+
1131
+ @tool("search_pubmed", "Search PubMed for biomedical papers", {
1132
+ "query": str,
1133
+ "max_results": int
1134
+ })
1135
+ async def search_pubmed(args):
1136
+     results = await _do_pubmed_search(args["query"], args["max_results"])
1137
+     return {"content": [{"type": "text", "text": json.dumps(results)}]}
1138
+ ```
1139
+
1140
+ **Why it's Gucci:** Cleaner syntax, automatic schema generation, less boilerplate.
1141
+
1142
+ #### From AutoGen: Reflect on Tool Use
1143
+ Add a reflection step where the model reviews its own tool results:
1144
+
1145
+ ```python
1146
+ # CURRENT: Judge evaluates evidence
1147
+ assessment = await judge.assess(question, evidence)
1148
+
1149
+ # UPGRADE: Add reflection step (stolen from AutoGen)
1150
+ class ReflectiveJudge:
1151
+     async def assess_with_reflection(self, question, evidence, tool_results):
1151
+         # First pass: raw assessment
1152
+         initial = await self._assess(question, evidence)
1153
+
1154
+         # Reflection: "Did I use the tools correctly?"
1155
+         reflection = await self._reflect_on_tool_use(tool_results)
1156
+
1157
+         # Final: combine assessment + reflection
1158
+         return self._combine(initial, reflection)
1160
+ ```
1161
+
1162
+ **Why it's Gucci:** Catches tool misuse, improves accuracy, more robust judge.
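+
+ One way the `_reflect_on_tool_use` step could look (a sketch; the prompt wording, `self._llm`, and the `tool_name`/`summary` fields on tool results are placeholders, not settled API):
+
+ ```python
+ # Sketch: the reflection pass inside ReflectiveJudge -- ask the model to critique its own tool usage.
+ REFLECTION_PROMPT = (
+     "Review the tool calls below. Did any search fail, return irrelevant hits, "
+     "or get misread? List concrete problems, or answer 'none'."
+ )
+
+ async def _reflect_on_tool_use(self, tool_results) -> str:
+     summary = "\n".join(f"{r.tool_name}: {r.summary}" for r in tool_results)
+     return await self._llm.complete(f"{REFLECTION_PROMPT}\n\n{summary}")
+ ```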
1163
+
1164
+ ### Tier 2: Medium Lifts (4-8 hours each)
1165
+
1166
+ #### From AutoGen: Concurrent Tool Execution
1167
+ Run multiple tools in parallel with proper error handling:
1168
+
1169
+ ```python
1170
+ # CURRENT: plain asyncio.gather (already concurrent, but no timeout or cancellation)
1171
+ results = await asyncio.gather(*[tool.search(query) for tool in tools])
1172
+
1173
+ # UPGRADE: AutoGen-style with cancellation + timeout
1174
+ from autogen_core import CancellationToken
1175
+
1176
+ async def execute_tools_concurrent(tools, query, timeout=30):
1177
+     token = CancellationToken()
1178
+
1179
+     async def run_with_timeout(tool):
1180
+         try:
1181
+             return await asyncio.wait_for(
1182
+                 tool.search(query, cancellation_token=token),
1183
+                 timeout=timeout
1184
+             )
1185
+         except asyncio.TimeoutError:
1186
+             token.cancel()  # Cancel other tools
1187
+             return ToolError(f"{tool.name} timed out")
1188
+
1189
+     return await asyncio.gather(*[run_with_timeout(t) for t in tools])
1190
+ ```
1191
+
1192
+ **Why it's Gucci:** Proper timeout handling, cancellation propagation, production-ready.
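+
+ On the calling side we'd filter failures instead of letting one slow source kill the loop (a usage sketch; `pubmed_tool` and `web_tool` are placeholder tool instances):
+
+ ```python
+ # Usage sketch: degrade gracefully when a source times out or errors.
+ raw = await execute_tools_concurrent([pubmed_tool, web_tool], query, timeout=20)
+ evidence = [r for r in raw if not isinstance(r, ToolError)]
+ if not evidence:
+     raise RuntimeError("All evidence sources failed or timed out")
+ ```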
1193
+
1194
+ #### From Claude SDK: Hooks System
1195
+ Add pre/post hooks for logging, validation, cost tracking:
1196
+
1197
+ ```python
1198
+ # UPGRADE: Hook system (stolen from Claude SDK)
1199
+ class HookManager:
1200
+     async def pre_tool_use(self, tool_name, args):
1201
+         """Called before every tool execution"""
1202
+         logger.info(f"Calling {tool_name} with {args}")
1203
+         self.cost_tracker.start_timer()
1204
+
1205
+     async def post_tool_use(self, tool_name, result, duration):
1206
+         """Called after every tool execution"""
1207
+         self.cost_tracker.record(tool_name, duration)
1208
+         if result.is_error:
1209
+             self.error_tracker.record(tool_name, result.error)
1210
+ ```
1211
+
1212
+ **Why it's Gucci:** Observability, debugging, cost tracking, production-ready.
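+
+ Wiring the hooks into every tool call can be one small wrapper (a sketch; `HookManager` internals like `cost_tracker` stay placeholders):
+
+ ```python
+ # Sketch: wrap each tool call with the pre/post hooks.
+ import time
+
+ async def run_tool_with_hooks(hooks: HookManager, tool, args: dict):
+     await hooks.pre_tool_use(tool.name, args)
+     start = time.monotonic()
+     result = await tool.search(**args)
+     await hooks.post_tool_use(tool.name, result, duration=time.monotonic() - start)
+     return result
+ ```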
1213
+
1214
+ ### Tier 3: Big Lifts (Post-Hackathon)
1215
+
1216
+ #### Full AutoGen Integration
1217
+ If we want multi-agent capabilities later:
1218
+
1219
+ ```python
1220
+ # POST-HACKATHON: Multi-agent drug repurposing (model_client = any autogen_ext chat client)
1221
+ from autogen_agentchat.agents import AssistantAgent
1222
+ from autogen_agentchat.teams import RoundRobinGroupChat
1223
+ literature_agent = AssistantAgent(
1224
+ name="LiteratureReviewer",
1225
+ tools=[pubmed_search, web_search],
1226
+ system_message="You search and summarize medical literature."
1227
+ )
1228
+
1229
+ mechanism_agent = AssistantAgent(
1230
+ name="MechanismAnalyzer",
1231
+ tools=[pathway_db, protein_db],
1232
+ system_message="You analyze disease mechanisms and drug targets."
1233
+ )
1234
+
1235
+ synthesis_agent = AssistantAgent(
1236
+ name="ReportSynthesizer",
1237
+ system_message="You synthesize findings into actionable reports."
1238
+ )
1239
+
1240
+ # Orchestrate multi-agent workflow
1241
+ group_chat = RoundRobinGroupChat(
1242
+     participants=[literature_agent, mechanism_agent, synthesis_agent],
1243
+     max_turns=10
1244
+ )
1245
+ ```
1246
+
1247
+ **Why it's Gucci:** True multi-agent collaboration, specialized roles, scalable.
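+
+ Kicking it off would then be one call (assuming the `RoundRobinGroupChat` sketch above on a recent `autogen-agentchat` release):
+
+ ```python
+ # Run the multi-agent workflow on a concrete question.
+ result = await group_chat.run(
+     task="Propose repurposing candidates for the given disease and summarize the evidence."
+ )
+ print(result.messages[-1].content)  # final turn: the synthesized report
+ ```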
1248
+
1249
+ ---
1250
+
1251
+ ## Priority Order for Stretch Goals
1252
+
1253
+ | Priority | Feature | Source | Effort | Impact |
1254
+ |----------|---------|--------|--------|--------|
1255
+ | 1 | `@tool` decorator | Claude SDK | 2 hrs | High - cleaner code |
1256
+ | 2 | Reflect on tool use | AutoGen | 3 hrs | High - better accuracy |
1257
+ | 3 | Hooks system | Claude SDK | 4 hrs | Medium - observability |
1258
+ | 4 | Concurrent + cancellation | AutoGen | 4 hrs | Medium - robustness |
1259
+ | 5 | Multi-agent | AutoGen | 8+ hrs | Post-hackathon |
1260
+
1261
+ ---
1262
+
1263
+ ## The Bottom Line
1264
+
1265
+ ```
1266
+ ┌──────────────────────────────────────────────┐
1267
+ │ MVP (Days 1-4): Pydantic AI + FastMCP        │
1268
+ │ - Ship working drug repurposing agent        │
1269
+ │ - Search-judge loop with PubMed + Web        │
1270
+ │ - Gradio UI with streaming                   │
1271
+ │ - MCP server for hackathon track             │
1272
+ ├──────────────────────────────────────────────┤
1273
+ │ If Crushing It (Days 5-6): Steal the Gucci   │
1274
+ │ - @tool decorators from Claude SDK           │
1275
+ │ - Reflect on tool use from AutoGen           │
1276
+ │ - Hooks for observability                    │
1277
+ ├──────────────────────────────────────────────┤
1278
+ │ Post-Hackathon: Full AutoGen Integration     │
1279
+ │ - Multi-agent workflows                      │
1280
+ │ - Specialized agent roles                    │
1281
+ │ - Production-grade orchestration             │
1282
+ └──────────────────────────────────────────────┘
1283
+ ```
1284
+
1285
+ **Ship MVP first. Steal bangers if time. Scale later.**
1286
+
1287
+ ---
1288
+
1289
  ---
1290
 
1291
  **Document Status**: Official Architecture Spec
1292
  **Review Score**: 99/100
1293
+ **Sections**: 15 (13 design patterns + frameworks analysis + stretch goals) + data models appendix
1294
  **Last Updated**: November 2025