axegameon committed
Commit 3e435ad · verified · 1 Parent(s): 6260c2f

Upload ALSARA app files (#1)


- Upload ALSARA app files (2076784d2a441a06677136e33fa32aa2e9bacb72)

.env.example ADDED
@@ -0,0 +1,37 @@
+ # ALS Research Agent Environment Configuration
+
+ # Anthropic API Key (Required)
+ ANTHROPIC_API_KEY=your_anthropic_api_key_here
+
+ # Optional: Specify which Anthropic model to use
+ # Available models: claude-sonnet-4-5-20250929, claude-3-opus-20240229, claude-3-haiku-20240307
+ # Default: claude-sonnet-4-5-20250929
+ ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
+
+ # Optional: Gradio server configuration
+ GRADIO_SERVER_PORT=7860  # Default port for Gradio UI
+
+ # ElevenLabs Configuration (Optional - for voice capabilities)
+ ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
+ ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM  # Rachel voice (clear for medical terms)
+
+ # LlamaIndex RAG Configuration (Optional - for research memory)
+ CHROMA_DB_PATH=./chroma_db  # Path to persist vector database
+ LLAMAINDEX_EMBED_MODEL=dmis-lab/biobert-base-cased-v1.2  # Biomedical embedding model
+ LLAMAINDEX_CHUNK_SIZE=1024  # Text chunk size for indexing
+ LLAMAINDEX_CHUNK_OVERLAP=200  # Overlap between chunks
+
+ # Optional: Show agent thinking process in UI
+ SHOW_THINKING=false
+
+ # Optional: LLM provider preference
+ # Options: quality_optimize (best model), cost_optimize (cheaper model), auto (default)
+ LLM_PROVIDER_PREFERENCE=auto
+
+ # Research API Configuration (Optional)
+ # Configure these if you want to limit API usage
+ RATE_LIMIT_PUBMED_DELAY=1.0  # Delay between PubMed requests (seconds)
+ RATE_LIMIT_BIORXIV_DELAY=1.0  # Delay between bioRxiv requests (seconds)
+
+ # Optional: Max concurrent searches
+ MAX_CONCURRENT_SEARCHES=3
README.md CHANGED
@@ -1,14 +1,474 @@
  ---
- title: ALSARA
- emoji: 🔥
- colorFrom: indigo
- colorTo: red
  sdk: gradio
- sdk_version: 6.0.1
- app_file: app.py
- pinned: false
  license: mit
- short_description: ALSARA is an agentic research assistant for ALS
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: ALSARA - ALS Agentic Research Agent
+ emoji: 🧬
+ colorFrom: blue
+ colorTo: green
  sdk: gradio
+ sdk_version: "6.0.0"
+ app_file: als_agent_app.py
  license: mit
+ short_description: AI research assistant for ALS research & trials
+ pinned: false
+ sponsors: Sambanova, Anthropic, ElevenLabs, LlamaIndex
+ tags:
+ - mcp-in-action-track-consumer
+ ---
+
+ # ALSARA - ALS Agentic Research Agent
+
+ ALSARA (ALS Agentic Research Assistant) is an AI-powered research tool that intelligently orchestrates multiple biomedical databases to answer complex questions about ALS (Amyotrophic Lateral Sclerosis) research, treatments, and clinical trials in real time.
+
+ Built around a 4-phase agentic workflow (Planning → Executing → Reflecting → Synthesis), ALSARA searches PubMed and 559,000+ clinical trials via the AACT database, and provides voice accessibility for ALS patients - delivering comprehensive research in 5-10 seconds.
+
+ Built with the Model Context Protocol (MCP), Gradio 6.x, and Anthropic Claude.
+
+ ## Key Features
+
+ ### Core Capabilities
+ - **4-Phase Agentic Workflow**: Intelligent planning, parallel execution, reflection with gap-filling, and comprehensive synthesis
+ - **Real-time Literature Search**: Query millions of PubMed peer-reviewed papers
+ - **Clinical Trial Discovery**: Access 559,000+ trials from the AACT PostgreSQL database (primary) with ClinicalTrials.gov fallback
+ - **Voice Accessibility**: Text-to-speech using ElevenLabs for ALS patients with limited mobility
+ - **Smart Caching**: Query normalization with a 24-hour TTL for instant responses to similar queries (see the sketch after this list)
+ - **Parallel Tool Execution**: 70% faster responses by running all searches simultaneously
+
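+ As a rough illustration of the kind of query normalization the smart cache relies on, here is a minimal sketch. The exact rules live in `smart_cache.py` and may differ; the helper names below are hypothetical:
+
+ ```python
+ import hashlib
+ import re
+
+ def normalize_query(query: str) -> str:
+     """Illustrative normalization: lowercase, strip punctuation, collapse whitespace."""
+     q = query.lower().strip()
+     q = re.sub(r"[^\w\s]", "", q)   # drop punctuation
+     q = re.sub(r"\s+", " ", q)      # collapse runs of whitespace
+     return q
+
+ def cache_key(query: str) -> str:
+     """Hash the normalized query so 'ALS gene therapy?' and 'als Gene Therapy' collide."""
+     return hashlib.sha256(normalize_query(query).encode("utf-8")).hexdigest()
+ ```
+
+ With a scheme like this, near-duplicate questions map to the same cache entry, which is what makes sub-100ms cache hits possible.
+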
+ ### Advanced Features
+ - **Multi-Provider LLM Support**: Claude primary with SambaNova Llama 3.3 70B fallback
+ - **Query Classification**: Smart routing between simple answers and complex research
+ - **Rate Limiting**: 30 requests/minute per user with exponential backoff
+ - **Memory Management**: Automatic conversation truncation and garbage collection
+ - **Health Monitoring**: Uptime tracking, error rates, and tool usage statistics
+ - **Citation Tracking**: All responses include PMIDs, DOIs, NCT IDs, and source references
+ - **Web Scraping**: Fetch full-text articles with SSRF protection
+ - **Export Conversations**: Download chat history as markdown files
+
+ ## Architecture
+
+ The system uses a multi-layer architecture:
+
+ ### 1. User Interface Layer
+ - **Gradio 6.x** web application with chat interface
+ - Real-time streaming responses
+ - Voice output controls
+ - Export and retry functionality
+
+ ### 2. Agentic Orchestration Layer
+ **4-Phase Workflow** (sketched below):
+ 1. **PLANNING**: The agent strategizes which databases to query
+ 2. **EXECUTING**: Parallel searches across all data sources
+ 3. **REFLECTING**: Evaluates results, identifies gaps, and runs additional searches
+ 4. **SYNTHESIS**: Comprehensive answer with citations and confidence scoring
+
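+ A minimal, self-contained sketch of how the four phases could be driven in code. The real orchestration lives in `als_agent_app.py` and `refactored_helpers.py`; the `run_search` stub and the planning heuristics here are purely illustrative:
+
+ ```python
+ import asyncio
+ from typing import List
+
+ async def run_search(term: str) -> str:
+     # Stand-in for a real MCP tool call (e.g., pubmed__search_pubmed)
+     await asyncio.sleep(0.1)
+     return f"results for '{term}'"
+
+ async def run_agentic_workflow(query: str) -> str:
+     # PLANNING: decide which searches to run (always anchored to ALS)
+     plan: List[str] = [f"{query} ALS", f"{query} ALS clinical trials"]
+     # EXECUTING: run all planned searches in parallel
+     results = list(await asyncio.gather(*(run_search(t) for t in plan)))
+     # REFLECTING: if coverage looks thin, run follow-up searches within this phase
+     if len(results) < 2:
+         results += await asyncio.gather(run_search(f"{query} amyotrophic lateral sclerosis"))
+     # SYNTHESIS: fold everything into one cited answer
+     return "\n".join(results)
+
+ print(asyncio.run(run_agentic_workflow("gene therapy")))
+ ```
+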
+ ### 3. LLM Provider Layer
+ - **Primary**: Anthropic Claude (claude-sonnet-4-5-20250929)
+ - **Fallback**: SambaNova Llama 3.3 70B (free alternative)
+ - Smart routing based on query complexity
+
+ ### 4. MCP Server Layer
+ Each server runs as a separate subprocess with JSON-RPC communication:
+
+ - **aact-server**: Primary clinical trials database (559,000+ trials)
+ - **pubmed-server**: PubMed literature search
+ - **fetch-server**: Web scraping with security hardening
+ - **elevenlabs-server**: Voice synthesis for accessibility
+ - **clinicaltrials_links**: Fallback trial links when AACT is unavailable
+ - **llamaindex-server**: RAG/semantic search (optional)
+
+ **Technical Note:** A custom MCP client (`custom_mcp_client.py`) works around SDK bugs with proper async/await handling, line-buffered I/O, and automatic retry logic.
+
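+ To make the subprocess/JSON-RPC model concrete, here is a heavily simplified sketch of spawning a server over stdio and issuing one `tools/call` request. The real client in `custom_mcp_client.py` also performs the MCP initialize handshake, retries, and buffered I/O, all omitted here:
+
+ ```python
+ import asyncio
+ import json
+
+ async def call_tool_once(server_script: str, tool: str, args: dict) -> str:
+     """Spawn an MCP server subprocess and send one JSON-RPC request (simplified)."""
+     proc = await asyncio.create_subprocess_exec(
+         "python", server_script,
+         stdin=asyncio.subprocess.PIPE,
+         stdout=asyncio.subprocess.PIPE,
+     )
+     request = {
+         "jsonrpc": "2.0",
+         "id": 1,
+         "method": "tools/call",
+         "params": {"name": tool, "arguments": args},
+     }
+     # The MCP stdio transport frames messages as newline-delimited JSON
+     proc.stdin.write((json.dumps(request) + "\n").encode())
+     await proc.stdin.drain()
+     line = await proc.stdout.readline()
+     proc.kill()
+     return line.decode()
+ ```
+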
+ ## Available Tools
+
+ The agent has access to specialized tools across 6 MCP servers:
+
+ ### AACT Clinical Trials Database Tools (PRIMARY)
+
+ #### 1. `aact__search_aact_trials`
+ Search 559,000+ clinical trials from the AACT PostgreSQL database.
+
+ **Parameters:**
+ - `condition` (string, optional): Medical condition (default: "ALS")
+ - `status` (string, optional): Trial status - "recruiting", "active", "completed", "all"
+ - `intervention` (string, optional): Treatment/drug name
+ - `sponsor` (string, optional): Trial sponsor organization
+ - `phase` (string, optional): Trial phase (1, 2, 3, 4)
+ - `max_results` (integer, optional): Maximum results (default: 10)
+
+ **Returns:** Comprehensive trial data with NCT IDs, titles, status, phases, enrollment, and locations.
+
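+ For example, a call from inside the agent loop might look like this, using the `call_mcp_tool` helper defined in `als_agent_app.py` (the argument values are illustrative):
+
+ ```python
+ result = await call_mcp_tool(
+     "aact__search_aact_trials",
+     {
+         "condition": "ALS",
+         "status": "recruiting",
+         "intervention": "gene therapy",
+         "max_results": 5,
+     },
+ )
+ ```
+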
+ #### 2. `aact__get_aact_trial`
+ Get complete details for a specific clinical trial.
+
+ **Parameters:**
+ - `nct_id` (string, required): ClinicalTrials.gov NCT ID
+
+ **Returns:** Full trial information including eligibility, outcomes, interventions, and contacts.
+
  ---

+ ### PubMed Literature Tools
+
+ #### 3. `pubmed__search_pubmed`
+ Search PubMed for peer-reviewed research papers.
+
+ **Parameters:**
+ - `query` (string, required): Search query (e.g., "ALS SOD1 therapy")
+ - `max_results` (integer, optional): Maximum results (default: 10)
+ - `sort` (string, optional): Sort by "relevance" or "date"
+
+ **Returns:** Papers with titles, abstracts, PMIDs, authors, and publication dates.
+
+ #### 4. `pubmed__get_paper_details`
+ Get complete details for a specific PubMed paper.
+
+ **Parameters:**
+ - `pmid` (string, required): PubMed ID
+
+ **Returns:** Full paper information including abstract, journal, DOI, and PubMed URL.
+
+ ---
+
+ ### Web Fetching Tools
+
+ #### 5. `fetch__fetch_url`
+ Fetch and extract content from web URLs with security hardening.
+
+ **Parameters:**
+ - `url` (string, required): URL to fetch
+ - `extract_text_only` (boolean, optional): Extract only text content (default: true)
+
+ **Returns:** Extracted webpage content with SSRF protection.
+
+ ---
+
+ ### Voice Accessibility Tools
+
+ #### 6. `elevenlabs__text_to_speech`
+ Convert research findings to audio for accessibility.
+
+ **Parameters:**
+ - `text` (string, required): Text to convert (max 2500 chars)
+ - `voice_id` (string, optional): Voice selection (default: Rachel - medical-friendly)
+ - `speed` (number, optional): Speech speed (0.5-2.0)
+
+ **Returns:** Audio stream for playback.
+
+ ---
+
+ ### Fallback Tools
+
+ #### 7. `clinicaltrials_links__get_known_als_trials`
+ Returns a curated list of important ALS trials when AACT is unavailable.
+
+ #### 8. `clinicaltrials_links__get_search_link`
+ Generates direct ClinicalTrials.gov search URLs.
+
+ ---
+
+ ### Tool Usage Notes
+
+ - **Rate Limiting**: All tools respect API rate limits (PubMed: 3 req/sec)
+ - **Caching**: Results are cached for 24 hours with smart query normalization
+ - **Connection Pooling**: AACT uses async PostgreSQL with 2-10 connections
+ - **Timeout Protection**: 90-second timeout with automatic retry (see the backoff sketch below)
+ - **Security**: SSRF protection, input validation, content size limits
+
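+ A rough sketch of the timeout-plus-retry behavior; the production version is `call_mcp_tool` in `als_agent_app.py`, which additionally caches results and formats errors:
+
+ ```python
+ import asyncio
+
+ async def with_retry(coro_factory, max_retries: int = 3, timeout: float = 90.0):
+     """Retry an async call with exponential backoff: 1s, 2s, 4s between attempts."""
+     for attempt in range(max_retries):
+         try:
+             return await asyncio.wait_for(coro_factory(), timeout=timeout)
+         except (asyncio.TimeoutError, ConnectionError):
+             if attempt == max_retries - 1:
+                 raise
+             await asyncio.sleep(2 ** attempt)
+ ```
+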
+ ## Quick Start
+
+ ### Prerequisites
+
+ - Python 3.10+ (3.12 recommended)
+ - Anthropic API key
+ - Git
+
+ ### Installation
+
+ 1. Clone the repository
+
+ ```bash
+ git clone https://github.com/yourusername/als-research-agent.git
+ cd als-research-agent
+ ```
+
+ 2. Create a virtual environment
+
+ ```bash
+ python3.12 -m venv venv
+ source venv/bin/activate  # On Windows: venv\Scripts\activate
+ ```
+
+ 3. Install dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 4. Set up environment variables
+
+ Create a `.env` file:
+
+ ```bash
+ # Required
+ ANTHROPIC_API_KEY=sk-ant-xxx
+
+ # Recommended
+ ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
+ ELEVENLABS_API_KEY=xxx
+ ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM  # Rachel voice
+
+ # Optional Features
+ ENABLE_RAG=false        # Enable semantic search (requires setup)
+ USE_FALLBACK_LLM=true   # Enable free SambaNova fallback
+ DISABLE_CACHE=false     # Disable smart caching
+
+ # Configuration
+ GRADIO_SERVER_PORT=7860
+ MAX_CONCURRENT_SEARCHES=3
+ RATE_LIMIT_PUBMED_DELAY=1.0
+ ```
+
+ 5. Run the application
+
+ ```bash
+ python als_agent_app.py
+ ```
+
+ or
+
+ ```bash
+ ./venv/bin/python3.12 als_agent_app.py 2>&1
+ ```
+
+ The app will launch at http://localhost:7860
+
+ ## Project Structure
+
+ ```
+ als-research-agent/
+ ├── README.md
+ ├── requirements.txt
+ ├── .env.example
+ ├── als_agent_app.py             # Main Gradio application (1835 lines)
+ ├── custom_mcp_client.py         # Custom MCP client implementation
+ ├── llm_client.py                # Multi-provider LLM abstraction
+ ├── query_classifier.py          # Research vs simple query detection
+ ├── smart_cache.py               # Query normalization and caching
+ ├── refactored_helpers.py        # Streaming and tool execution
+ ├── parallel_tool_execution.py   # Concurrent search management
+ ├── servers/
+ │   ├── aact_server.py           # AACT clinical trials database (PRIMARY)
+ │   ├── pubmed_server.py         # PubMed literature search
+ │   ├── fetch_server.py          # Web scraping with security
+ │   ├── elevenlabs_server.py     # Voice synthesis
+ │   ├── clinicaltrials_links.py  # Fallback trial links
+ │   └── llamaindex_server.py     # RAG/semantic search (optional)
+ ├── shared/
+ │   ├── __init__.py
+ │   ├── config.py                # Centralized configuration
+ │   ├── cache.py                 # TTL-based caching
+ │   └── utils.py                 # Rate limiting and formatting
+ └── tests/
+     ├── test_pubmed_server.py
+     ├── test_aact_server.py
+     ├── test_fetch_server.py
+     ├── test_elevenlabs.py
+     ├── test_integration.py
+     ├── test_llm_client.py
+     ├── test_performance.py
+     └── test_workflow_*.py
+ ```
+
+ ## Usage Examples
+
+ ### Example Queries
+
+ **Complex Research Questions:**
+ - "What are the latest gene therapy trials for SOD1 mutations with recent biomarker data?"
+ - "Compare antisense oligonucleotide therapies in Phase 2 or 3 trials"
+ - "Find recent PubMed papers on ALS protein aggregation from Japanese researchers"
+
+ **Clinical Trial Discovery:**
+ - "Active trials in Germany for bulbar onset ALS"
+ - "Recruiting trials for ALS patients under 40 with slow progression"
+ - "Phase 3 trials sponsored by Biogen or Ionis"
+
+ **Treatment Information:**
+ - "Compare efficacy of riluzole, edaravone, and AMX0035"
+ - "What combination therapies showed promise in 2024?"
+ - "Latest developments in stem cell therapy for ALS"
+
+ **Accessibility Features:**
+ - Click the voice icon to hear research summaries
+ - Adjustable speech speed for comfort
+ - Medical-friendly voice optimized for clarity
+
+ ## Performance Characteristics
+
+ - **Typical Response Time**: 5-10 seconds for complex queries
+ - **Parallel Speedup**: 70% faster than sequential searching
+ - **Cache Hit Time**: <100ms for similar queries (24-hour TTL)
+ - **Concurrent Handling**: 4 requests in ~8 seconds
+ - **Tool Call Timeout**: 90 seconds with automatic retry
+ - **Memory Limit**: 50 messages per conversation (~8-50KB per message)
+
+ ## Development
+
+ ### Running Tests
+
+ ```bash
+ # All tests
+ pytest tests/ -v
+
+ # Unit tests only
+ pytest tests/ -m "not integration"
+
+ # With coverage
+ pytest --cov=servers --cov-report=html
+
+ # Quick tests
+ ./run_quick_tests.sh
+ ```
+
+ ### Adding New MCP Servers
+
+ 1. Create a new server file in `servers/`
+ 2. Use the FastMCP API to implement tools:
+
+ ```python
+ from mcp.server.fastmcp import FastMCP
+
+ mcp = FastMCP("my-server")
+
+ @mcp.tool()
+ async def my_tool(param: str) -> str:
+     """Tool description"""
+     return f"Result: {param}"
+
+ if __name__ == "__main__":
+     mcp.run(transport="stdio")
+ ```
+
+ 3. Add the server to `als_agent_app.py` in `setup_mcp_servers()` (see the sketch below)
+ 4. Write tests in `tests/`
+
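+ Registration amounts to adding one entry to the `servers` dict inside `setup_mcp_servers()`; the `"my-server"` key and file name below are hypothetical:
+
+ ```python
+ # Inside setup_mcp_servers() in als_agent_app.py
+ servers = {
+     "pubmed": servers_dir / "pubmed_server.py",
+     "aact": servers_dir / "aact_server.py",
+     # ... existing servers ...
+     "my-server": servers_dir / "my_server.py",  # hypothetical new server
+ }
+ ```
+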
+ ## Deployment
+
+ ### Hugging Face Spaces
+
+ 1. Create a Gradio Space
+ 2. Push your code
+ 3. Add secrets:
+    - `ANTHROPIC_API_KEY` (required)
+    - `ELEVENLABS_API_KEY` (for voice features)
+
+ ### Docker
+
+ ```bash
+ docker build -t als-research-agent .
+ docker run -p 7860:7860 \
+   -e ANTHROPIC_API_KEY=your_key \
+   -e ELEVENLABS_API_KEY=your_key \
+   als-research-agent
+ ```
+
+ ### Cloud Deployment (Azure/AWS/GCP)
+
+ The application is containerized and ready for deployment on any cloud platform that supports Docker containers. See the deployment guides for specific platforms.
+
+ ## Troubleshooting
+
+ **MCP server not responding**
+ - Check the Python path and virtual environment activation
+ - Verify all dependencies are installed: `pip install -r requirements.txt`
+
+ **Rate limit exceeded**
+ - Add delays between requests
+ - Check your Anthropic API quota
+ - Use `USE_FALLBACK_LLM=true` for the free alternative
+
+ **Voice synthesis not working**
+ - Verify `ELEVENLABS_API_KEY` is set
+ - Check your API quota on the ElevenLabs dashboard
+ - The text may be too long (max 2500 chars)
+
+ **AACT database connection issues**
+ - The database may be under maintenance (Sunday 7 AM ET)
+ - Fallback to the `clinicaltrials_links` server activates automatically
+
+ **Cache not working**
+ - Check that `DISABLE_CACHE` is not set to true
+ - Verify the `.cache/` directory has write permissions
+
+ ## Resources
+
+ ### ALS Research Organizations
+ - ALS Association: https://www.als.org/
+ - ALS Therapy Development Institute: https://www.als.net/
+ - Answer ALS Data Portal: https://dataportal.answerals.org/
+ - International Alliance of ALS/MND Associations: https://www.als-mnd.org/
+
+ ### Data Sources
+ - PubMed E-utilities: https://www.ncbi.nlm.nih.gov/books/NBK25501/
+ - AACT Database: https://aact.ctti-clinicaltrials.org/
+ - ClinicalTrials.gov: https://clinicaltrials.gov/
+
+ ### Technologies
+ - Model Context Protocol: https://modelcontextprotocol.io/
+ - Gradio Documentation: https://www.gradio.app/docs/
+ - Anthropic Claude: https://www.anthropic.com/
+ - ElevenLabs API: https://elevenlabs.io/
+
+ ## Security & Privacy
+
+ - **No Patient Data Storage**: Conversations are not permanently stored
+ - **SSRF Protection**: Blocks access to private IPs and localhost
+ - **Input Validation**: Injection pattern detection and length limits
+ - **Rate Limiting**: Per-user request throttling
+ - **API Key Security**: All keys are stored as environment variables
+
+ ## License
+
+ MIT License - see the LICENSE file for details
+
+ ## Contributing
+
+ 1. Fork the repository
+ 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
+ 3. Write tests for your changes
+ 4. Ensure all tests pass (`pytest`)
+ 5. Commit your changes (`git commit -m 'Add amazing feature'`)
+ 6. Push to the branch (`git push origin feature/amazing-feature`)
+ 7. Open a Pull Request
+
+ ## Future Enhancements
+
+ ### In Development
+ - **NCBI Gene Database**: Gene information and mutations
+ - **OMIM Integration**: Genetic disorder phenotypes
+ - **Protein Data Bank**: 3D protein structures
+ - **AlphaFold Database**: AI-predicted protein structures
+
+ ### Planned Features
+ - **Voice Input**: Speech recognition for queries
+ - **Patient Trial Matching**: Personalized eligibility assessment
+ - **Research Trend Analysis**: Track emerging themes
+ - **Alert System**: Notifications for new trials/papers
+ - **Enhanced Export**: BibTeX, CSV, PDF formats
+ - **Multi-language Support**: Global accessibility
+ - **Drug Repurposing Module**: Identify potential ALS treatments
+ - **arXiv Integration**: Computational biology papers
+
+ ## Acknowledgments
+
+ Built for the global ALS research community to accelerate the path to a cure.
+
+ Special thanks to:
+ - The MCP team for the Model Context Protocol
+ - Anthropic for Claude AI
+ - The open-source community for invaluable contributions
+
+ ---
+
+ **ALSARA - Accelerating ALS research, one query at a time.**
+
+ For questions, issues, or contributions, please open an issue on GitHub.
als_agent_app.py ADDED
@@ -0,0 +1,1832 @@
1
+ # als_agent_app.py
2
+ import gradio as gr
3
+ import asyncio
4
+ import json
5
+ import os
6
+ import logging
7
+ from pathlib import Path
8
+ from datetime import datetime, timedelta
9
+ import sys
10
+ import time
11
+ from typing import Optional, List, Dict, Any, Tuple, AsyncGenerator, Union
12
+ from collections import defaultdict
13
+ from dotenv import load_dotenv
14
+ import httpx
15
+ import base64
16
+ import tempfile
17
+ import re
18
+
19
+ # Load environment variables from .env file
20
+ load_dotenv()
21
+
22
+ # Add current directory to path for shared imports
23
+ sys.path.insert(0, str(Path(__file__).parent))
24
+ from shared import SimpleCache
25
+ from custom_mcp_client import MCPClientManager
26
+ from llm_client import UnifiedLLMClient
27
+ from smart_cache import SmartCache, DEFAULT_PREWARM_QUERIES
28
+
29
+ # Helper function imports for refactored code
30
+ from refactored_helpers import (
31
+ stream_with_retry,
32
+ execute_tool_calls,
33
+ build_assistant_message
34
+ )
35
+
36
+ # Configure logging
37
+ logging.basicConfig(
38
+ level=logging.INFO,
39
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
40
+ handlers=[
41
+ logging.StreamHandler(),
42
+ logging.FileHandler('app.log', mode='a', encoding='utf-8')
43
+ ]
44
+ )
45
+ logger = logging.getLogger(__name__)
46
+
47
+ # Rate Limiter Class
48
+ class RateLimiter:
49
+ """Rate limiter to prevent API overload"""
50
+
51
+ def __init__(self, max_requests_per_minute: int = 30):
52
+ self.max_requests_per_minute = max_requests_per_minute
53
+ self.request_times = defaultdict(list)
54
+
55
+ async def check_rate_limit(self, key: str = "default") -> bool:
56
+ """Check if request is within rate limit"""
57
+ now = datetime.now()
58
+ minute_ago = now - timedelta(minutes=1)
59
+
60
+ # Clean old requests
61
+ self.request_times[key] = [
62
+ t for t in self.request_times[key]
63
+ if t > minute_ago
64
+ ]
65
+
66
+ # Check if under limit
67
+ if len(self.request_times[key]) >= self.max_requests_per_minute:
68
+ return False
69
+
70
+ # Record this request
71
+ self.request_times[key].append(now)
72
+ return True
73
+
74
+ async def wait_if_needed(self, key: str = "default"):
75
+ """Wait if rate limit exceeded"""
76
+ while not await self.check_rate_limit(key):
77
+ await asyncio.sleep(2) # Wait 2 seconds before retry
78
+
79
+ # Initialize rate limiter
80
+ rate_limiter = RateLimiter(max_requests_per_minute=30)
81
+
82
+ # Memory management settings
83
+ MAX_CONVERSATION_LENGTH = 50 # Maximum messages to keep in history
84
+ MEMORY_CLEANUP_INTERVAL = 300 # Cleanup every 5 minutes
85
+
86
+ async def cleanup_memory():
87
+ """Periodic memory cleanup task"""
88
+ while True:
89
+ try:
90
+ # Clean up expired cache entries
91
+ tool_cache.cleanup_expired()
92
+ smart_cache.cleanup() if smart_cache else None
93
+
94
+ # Force garbage collection for large cleanups
95
+ import gc
96
+ collected = gc.collect()
97
+ if collected > 0:
98
+ logger.debug(f"Memory cleanup: collected {collected} objects")
99
+
100
+ except Exception as e:
101
+ logger.error(f"Error during memory cleanup: {e}")
102
+
103
+ await asyncio.sleep(MEMORY_CLEANUP_INTERVAL)
104
+
105
+ # Start memory cleanup task
106
+ cleanup_task = None
107
+
108
+ # Track whether last response used research workflow (for voice button)
109
+ last_response_was_research = False
110
+
111
+ # Health monitoring
112
+ class HealthMonitor:
113
+ """Monitor system health and performance"""
114
+
115
+ def __init__(self):
116
+ self.start_time = datetime.now()
117
+ self.request_count = 0
118
+ self.error_count = 0
119
+ self.tool_call_count = defaultdict(int)
120
+ self.response_times = []
121
+ self.last_error = None
122
+
123
+ def record_request(self):
124
+ self.request_count += 1
125
+
126
+ def record_error(self, error: str):
127
+ self.error_count += 1
128
+ self.last_error = {"time": datetime.now(), "error": str(error)[:500]}
129
+
130
+ def record_tool_call(self, tool_name: str):
131
+ self.tool_call_count[tool_name] += 1
132
+
133
+ def record_response_time(self, duration: float):
134
+ self.response_times.append(duration)
135
+ # Keep only last 100 response times to avoid memory buildup
136
+ if len(self.response_times) > 100:
137
+ self.response_times = self.response_times[-100:]
138
+
139
+ def get_health_status(self) -> Dict[str, Any]:
140
+ """Get current health status"""
141
+ uptime = (datetime.now() - self.start_time).total_seconds()
142
+ avg_response_time = sum(self.response_times) / len(self.response_times) if self.response_times else 0
143
+
144
+ return {
145
+ "status": "healthy" if self.error_count < 10 else "degraded",
146
+ "uptime_seconds": uptime,
147
+ "request_count": self.request_count,
148
+ "error_count": self.error_count,
149
+ "error_rate": self.error_count / max(1, self.request_count),
150
+ "avg_response_time": avg_response_time,
151
+ "cache_size": tool_cache.size(),
152
+ "rate_limit_status": f"{len(rate_limiter.request_times)} active keys",
153
+ "most_used_tools": dict(sorted(self.tool_call_count.items(), key=lambda x: x[1], reverse=True)[:5]),
154
+ "last_error": self.last_error
155
+ }
156
+
157
+ # Initialize health monitor
158
+ health_monitor = HealthMonitor()
159
+
160
+ # Error message formatter
161
+ def format_error_message(error: Exception, context: str = "") -> str:
162
+ """Format error messages with helpful suggestions"""
163
+
164
+ error_str = str(error)
165
+ error_type = type(error).__name__
166
+
167
+ # Common error patterns and suggestions
168
+ if "timeout" in error_str.lower():
169
+ suggestion = """
170
+ **Suggestions:**
171
+ - Try simplifying your search query
172
+ - Break complex questions into smaller parts
173
+ - Check your internet connection
174
+ - The service may be temporarily overloaded - try again in a moment
175
+ """
176
+ elif "rate limit" in error_str.lower():
177
+ suggestion = """
178
+ **Suggestions:**
179
+ - Wait a moment before trying again
180
+ - Reduce the number of simultaneous searches
181
+ - Consider using cached results when available
182
+ """
183
+ elif "connection" in error_str.lower() or "network" in error_str.lower():
184
+ suggestion = """
185
+ **Suggestions:**
186
+ - Check your internet connection
187
+ - The external service may be temporarily unavailable
188
+ - Try again in a few moments
189
+ """
190
+ elif "invalid" in error_str.lower() or "validation" in error_str.lower():
191
+ suggestion = """
192
+ **Suggestions:**
193
+ - Check your query for special characters or formatting issues
194
+ - Ensure your question is clear and well-formed
195
+ - Avoid using HTML or script tags in your query
196
+ """
197
+ elif "memory" in error_str.lower() or "resource" in error_str.lower():
198
+ suggestion = """
199
+ **Suggestions:**
200
+ - The system may be under heavy load
201
+ - Try a simpler query
202
+ - Clear your browser cache and refresh the page
203
+ """
204
+ else:
205
+ suggestion = """
206
+ **Suggestions:**
207
+ - Try rephrasing your question
208
+ - Break complex queries into simpler parts
209
+ - If the error persists, please report it to support
210
+ """
211
+
212
+ formatted = f"""
213
+ ❌ **Error Encountered**
214
+
215
+ **Type:** {error_type}
216
+ **Details:** {error_str[:500]}
217
+ {f"**Context:** {context}" if context else ""}
218
+
219
+ {suggestion}
220
+
221
+ **Need Help?**
222
+ - Try the example queries in the sidebar
223
+ - Check the System Health tab for service status
224
+ - Report persistent issues on GitHub
225
+ """
226
+
227
+ return formatted.strip()
228
+
229
+ # Initialize the unified LLM client
230
+ # All provider logic is now handled inside UnifiedLLMClient
231
+ client = None # Initialize to None for proper cleanup handling
232
+ try:
233
+ client = UnifiedLLMClient()
234
+ logger.info(f"LLM client initialized: {client.get_provider_display_name()}")
235
+ except ValueError as e:
236
+ # Re-raise configuration errors with clear instructions
237
+ logger.error(f"LLM configuration error: {e}")
238
+ raise
239
+
240
+ # Global MCP client manager
241
+ mcp_manager = MCPClientManager()
242
+
243
+ # Internal thinking tags are always filtered for cleaner output
244
+
245
+ # Model configuration
246
+ # Use Claude 3.5 Sonnet with correct model ID that works with the API key
247
+ ANTHROPIC_MODEL = os.getenv("ANTHROPIC_MODEL", "claude-sonnet-4-5-20250929")
248
+ logger.info(f"Using model: {ANTHROPIC_MODEL}")
249
+
250
+ # Configuration for max tokens in responses
251
+ # Set MAX_RESPONSE_TOKENS in .env to control response length
252
+ # Claude 3.5 Sonnet supports up to 8192 tokens
253
+ MAX_RESPONSE_TOKENS = min(int(os.getenv("MAX_RESPONSE_TOKENS", "8192")), 8192)
254
+ logger.info(f"Max response tokens set to: {MAX_RESPONSE_TOKENS}")
255
+
256
+ # Global smart cache (24 hour TTL for research queries)
257
+ smart_cache = SmartCache(cache_dir=".cache", ttl_hours=24)
258
+
259
+ # Keep tool cache for MCP tool results
260
+ tool_cache = SimpleCache(ttl=3600)
261
+
262
+ # Cache for tool definitions to avoid repeated fetching
263
+ _cached_tools = None
264
+ _tools_cache_time = None
265
+ TOOLS_CACHE_TTL = 86400 # 24 hour cache for tool definitions (tools rarely change)
266
+
267
+ async def setup_mcp_servers() -> MCPClientManager:
268
+ """Initialize all MCP servers using custom client"""
269
+ logger.info("Setting up MCP servers...")
270
+
271
+ # Get the directory where this script is located
272
+ script_dir = Path(__file__).parent.resolve()
273
+ servers_dir = script_dir / "servers"
274
+
275
+ logger.info(f"Script directory: {script_dir}")
276
+ logger.info(f"Servers directory: {servers_dir}")
277
+
278
+ # Verify servers directory exists
279
+ if not servers_dir.exists():
280
+ logger.error(f"Servers directory not found: {servers_dir}")
281
+ raise FileNotFoundError(f"Servers directory not found: {servers_dir}")
282
+
283
+ # Add all servers to manager
284
+ servers = {
285
+ "pubmed": servers_dir / "pubmed_server.py",
286
+ "aact": servers_dir / "aact_server.py", # PRIMARY: AACT database for comprehensive clinical trials data
287
+ "trials_links": servers_dir / "clinicaltrials_links.py", # FALLBACK: Direct links and known ALS trials
288
+ "fetch": servers_dir / "fetch_server.py",
289
+ "elevenlabs": servers_dir / "elevenlabs_server.py", # Voice capabilities for accessibility
290
+ }
291
+
292
+ # bioRxiv temporarily disabled - commenting out to hide from users
293
+ # enable_biorxiv = os.getenv("ENABLE_BIORXIV", "true").lower() == "true"
294
+ # if enable_biorxiv:
295
+ # servers["biorxiv"] = servers_dir / "biorxiv_server.py"
296
+ # else:
297
+ # logger.info("⚠️ bioRxiv/medRxiv disabled for faster searches (set ENABLE_BIORXIV=true to enable)")
298
+
299
+ # Conditionally add LlamaIndex RAG based on environment variable
300
+ enable_rag = os.getenv("ENABLE_RAG", "false").lower() == "true"
301
+ if enable_rag:
302
+ logger.info("📚 RAG/LlamaIndex enabled (will add ~10s to startup for semantic search)")
303
+ servers["llamaindex"] = servers_dir / "llamaindex_server.py"
304
+ else:
305
+ logger.info("🚀 RAG/LlamaIndex disabled for faster startup (set ENABLE_RAG=true to enable)")
306
+
307
+ # Parallelize server initialization for faster startup
308
+ async def init_server(name: str, script_path: Path):
309
+ try:
310
+ await mcp_manager.add_server(name, str(script_path))
311
+ logger.info(f"✓ MCP server {name} initialized")
312
+ except Exception as e:
313
+ logger.error(f"Failed to initialize MCP server {name}: {e}")
314
+ raise
315
+
316
+ # Start all servers concurrently
317
+ tasks = [init_server(name, script_path) for name, script_path in servers.items()]
318
+ results = await asyncio.gather(*tasks, return_exceptions=True)
319
+
320
+ # Check for any failures
321
+ for i, result in enumerate(results):
322
+ if isinstance(result, Exception):
323
+ name = list(servers.keys())[i]
324
+ logger.error(f"Failed to initialize MCP server {name}: {result}")
325
+ raise result
326
+
327
+ logger.info("All MCP servers initialized successfully")
328
+ return mcp_manager
329
+
330
+ async def cleanup_mcp_servers() -> None:
331
+ """Cleanup MCP server sessions"""
332
+ logger.info("Cleaning up MCP server sessions...")
333
+ await mcp_manager.close_all()
334
+ logger.info("MCP cleanup complete")
335
+
336
+
337
+ def export_conversation(history: Optional[List[Any]]) -> Optional[Path]:
338
+ """Export conversation to markdown format"""
339
+ if not history:
340
+ return None
341
+
342
+ timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
343
+ filename = f"als_conversation_{timestamp}.md"
344
+
345
+ content = f"""# ALS Research Conversation
346
+ **Exported:** {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}
347
+
348
+ ---
349
+
350
+ """
351
+
352
+ for i, (user_msg, assistant_msg) in enumerate(history, 1):
353
+ content += f"## Query {i}\n\n**User:** {user_msg}\n\n**Assistant:**\n{assistant_msg}\n\n---\n\n"
354
+
355
+ content += f"""
356
+ *Generated by ALSARA - ALS Agentic Research Agent*
357
+ *Total interactions: {len(history)}*
358
+ """
359
+
360
+ filepath = Path(filename)
361
+ filepath.write_text(content, encoding='utf-8')
362
+ logger.info(f"Exported conversation to {filename}")
363
+
364
+ return filepath
365
+
366
+ async def get_all_tools() -> List[Dict[str, Any]]:
367
+ """Retrieve all available tools from MCP servers with caching"""
368
+ global _cached_tools, _tools_cache_time
369
+
370
+ # Check if cache is valid
371
+ if _cached_tools and _tools_cache_time:
372
+ if time.time() - _tools_cache_time < TOOLS_CACHE_TTL:
373
+ logger.debug("Using cached tool definitions")
374
+ return _cached_tools
375
+
376
+ # Fetch fresh tool definitions
377
+ logger.info("Fetching fresh tool definitions from MCP servers")
378
+ all_tools = []
379
+
380
+ # Get tools from all servers
381
+ server_tools = await mcp_manager.list_all_tools()
382
+
383
+ for server_name, tools in server_tools.items():
384
+ for tool in tools:
385
+ # Convert MCP tool to Anthropic function format
386
+ all_tools.append({
387
+ "name": f"{server_name}__{tool['name']}",
388
+ "description": tool.get('description', ''),
389
+ "input_schema": tool.get('inputSchema', {})
390
+ })
391
+
392
+ # Update cache
393
+ _cached_tools = all_tools
394
+ _tools_cache_time = time.time()
395
+ logger.info(f"Cached {len(all_tools)} tool definitions")
396
+
397
+ return all_tools
398
+
399
+ async def call_mcp_tool(tool_name: str, arguments: Dict[str, Any], max_retries: int = 3) -> str:
400
+ """Execute an MCP tool call with caching, rate limiting, retry logic, and error handling"""
401
+
402
+ # Check cache first (no retries needed for cached results)
403
+ cached_result = tool_cache.get(tool_name, arguments)
404
+ if cached_result:
405
+ return cached_result
406
+
407
+ last_error = None
408
+
409
+ for attempt in range(max_retries):
410
+ try:
411
+ # Apply rate limiting
412
+ await rate_limiter.wait_if_needed(tool_name.split("__")[0])
413
+
414
+ # Parse tool name
415
+ if "__" not in tool_name:
416
+ logger.error(f"Invalid tool name format: {tool_name}")
417
+ return f"Error: Invalid tool name format: {tool_name}"
418
+
419
+ server_name, tool_method = tool_name.split("__", 1)
420
+
421
+ if attempt > 0:
422
+ logger.info(f"Retry {attempt}/{max_retries} for tool: {tool_method} on server: {server_name}")
423
+ else:
424
+ logger.info(f"Calling tool: {tool_method} on server: {server_name}")
425
+
426
+ # Call tool with timeout using custom client
427
+ result = await asyncio.wait_for(
428
+ mcp_manager.call_tool(server_name, tool_method, arguments),
429
+ timeout=90.0 # 90 second timeout for complex tool calls (BioRxiv searches can be slow)
430
+ )
431
+
432
+ # Result is already a string from custom client
433
+ final_result = result if result else "No content returned from tool"
434
+
435
+ # Cache the result
436
+ tool_cache.set(tool_name, arguments, final_result)
437
+
438
+ # Record successful tool call
439
+ health_monitor.record_tool_call(tool_name)
440
+
441
+ return final_result
442
+
443
+ except asyncio.TimeoutError as e:
444
+ last_error = e
445
+ logger.warning(f"Tool call timed out (attempt {attempt + 1}/{max_retries}): {tool_name}")
446
+ if attempt < max_retries - 1:
447
+ await asyncio.sleep(2 ** attempt) # Exponential backoff: 1s, 2s, 4s
448
+ continue
449
+ # Last attempt failed
450
+ timeout_error = TimeoutError(f"Tool timeout after {max_retries} attempts - the {server_name} server may be overloaded")
451
+ return format_error_message(timeout_error, context=f"Calling {tool_name}")
452
+
453
+ except ValueError as e:
454
+ logger.error(f"Invalid tool/server: {tool_name} - {e}")
455
+ return format_error_message(e, context=f"Invalid tool: {tool_name}")
456
+
457
+ except Exception as e:
458
+ last_error = e
459
+ logger.warning(f"Error calling tool {tool_name} (attempt {attempt + 1}/{max_retries}): {e}")
460
+ if attempt < max_retries - 1:
461
+ await asyncio.sleep(2 ** attempt) # Exponential backoff
462
+ continue
463
+ # Last attempt failed
464
+ return format_error_message(e, context=f"Tool {tool_name} failed after {max_retries} attempts")
465
+
466
+ # Should not reach here, but handle just in case
467
+ if last_error:
468
+ return f"Tool failed after {max_retries} attempts: {str(last_error)[:200]}"
469
+ return "Unexpected error in tool execution"
470
+
471
+ def filter_internal_tags(text: str) -> str:
472
+ """Remove all internal processing tags from the output."""
473
+ import re
474
+
475
+ # Remove internal tags and their content with single regex
476
+ text = re.sub(r'<(thinking|search_quality_reflection|search_quality_score)>.*?</\1>|<(thinking|search_quality_reflection|search_quality_score)>.*$', '', text, flags=re.DOTALL)
477
+
478
+ # Remove wrapper tags but keep content
479
+ text = re.sub(r'</?(result|answer)>', '', text)
480
+
481
+ # Fix phase formatting - ensure consistent formatting
482
+ # Add proper line breaks around phase headers
483
+ # First normalize any existing phase markers to be on their own line
484
+ phase_patterns = [
485
+ # Fix incorrect formats (missing asterisks) first
486
+ (r'(?<!\*)🎯\s*PLANNING:(?!\*)', r'**🎯 PLANNING:**'),
487
+ (r'(?<!\*)🔧\s*EXECUTING:(?!\*)', r'**🔧 EXECUTING:**'),
488
+ (r'(?<!\*)🤔\s*REFLECTING:(?!\*)', r'**🤔 REFLECTING:**'),
489
+ (r'(?<!\*)✅\s*SYNTHESIS:(?!\*)', r'**✅ SYNTHESIS:**'),
490
+
491
+ # Then ensure the markers are on new lines (if not already)
492
+ (r'(?<!\n)(\*\*🎯\s*PLANNING:\*\*)', r'\n\n\1'),
493
+ (r'(?<!\n)(\*\*🔧\s*EXECUTING:\*\*)', r'\n\n\1'),
494
+ (r'(?<!\n)(\*\*🤔\s*REFLECTING:\*\*)', r'\n\n\1'),
495
+ (r'(?<!\n)(\*\*✅\s*SYNTHESIS:\*\*)', r'\n\n\1'),
496
+
497
+ # Then add spacing after them
498
+ (r'(\*\*🎯\s*PLANNING:\*\*)', r'\1\n'),
499
+ (r'(\*\*🔧\s*EXECUTING:\*\*)', r'\1\n'),
500
+ (r'(\*\*🤔\s*REFLECTING:\*\*)', r'\1\n'),
501
+ (r'(\*\*✅\s*SYNTHESIS:\*\*)', r'\1\n'),
502
+ ]
503
+
504
+ for pattern, replacement in phase_patterns:
505
+ text = re.sub(pattern, replacement, text)
506
+
507
+ # Clean up excessive whitespace while preserving intentional formatting
508
+ text = re.sub(r'[ \t]+', ' ', text) # Multiple spaces to single space
509
+ text = re.sub(r'\n{4,}', '\n\n\n', text) # Maximum 3 newlines
510
+ text = re.sub(r'^\n+', '', text) # Remove leading newlines
511
+ text = re.sub(r'\n+$', '\n', text) # Single trailing newline
512
+
513
+ return text.strip()
514
+
515
+ def is_complex_query(message: str) -> bool:
516
+ """Detect complex queries that might need more iterations"""
517
+ complex_indicators = [
518
+ "genotyping", "genetic testing", "multiple", "comprehensive",
519
+ "all", "compare", "versus", "difference between", "systematic",
520
+ "gene-targeted", "gene targeted", "list the main", "what are all",
521
+ "complete overview", "detailed analysis", "in-depth"
522
+ ]
523
+ return any(indicator in message.lower() for indicator in complex_indicators)
524
+
525
+
526
+ def validate_query(message: str) -> Tuple[bool, str]:
527
+ """Validate and sanitize user input to prevent injection and abuse"""
528
+ # Check length
529
+ if not message or not message.strip():
530
+ return False, "Please enter a query"
531
+
532
+ if len(message) > 2000:
533
+ return False, "Query too long (maximum 2000 characters). Please shorten your question."
534
+
535
+ # Check for potential injection patterns
536
+ suspicious_patterns = [
537
+ r'<script', r'javascript:', r'onclick', r'onerror',
538
+ r'\bignore\s+previous\s+instructions\b',
539
+ r'\bsystem\s+prompt\b',
540
+ r'\bforget\s+everything\b',
541
+ r'\bdisregard\s+all\b'
542
+ ]
543
+
544
+ for pattern in suspicious_patterns:
545
+ if re.search(pattern, message, re.IGNORECASE):
546
+ logger.warning(f"Suspicious pattern detected in query: {pattern}")
547
+ return False, "Invalid query format. Please rephrase your question."
548
+
549
+ # Check for excessive repetition (potential spam)
550
+ words = message.lower().split()
551
+ if len(words) > 10:
552
+ # Check if any word appears too frequently
553
+ word_freq = {}
554
+ for word in words:
555
+ word_freq[word] = word_freq.get(word, 0) + 1
556
+
557
+ max_freq = max(word_freq.values())
558
+ if max_freq > len(words) * 0.5: # If any word is more than 50% of the query
559
+ return False, "Query appears to contain excessive repetition. Please rephrase."
560
+
561
+ return True, ""
562
+
563
+
564
+ async def als_research_agent(message: str, history: Optional[List[Dict[str, Any]]]) -> AsyncGenerator[str, None]:
565
+ """Main agent logic with streaming response and error handling"""
566
+
567
+ global last_response_was_research
568
+
569
+ start_time = time.time()
570
+ health_monitor.record_request()
571
+
572
+ try:
573
+ # Validate input first
574
+ valid, error_msg = validate_query(message)
575
+ if not valid:
576
+ yield f"⚠️ **Input Validation Error:** {error_msg}"
577
+ return
578
+
579
+ logger.info(f"Received valid query: {message[:100]}...") # Log first 100 chars
580
+
581
+ # Truncate history to prevent memory bloat
582
+ if history and len(history) > MAX_CONVERSATION_LENGTH:
583
+ logger.info(f"Truncating conversation history from {len(history)} to {MAX_CONVERSATION_LENGTH} messages")
584
+ history = history[-MAX_CONVERSATION_LENGTH:]
585
+
586
+ # System prompt
587
+ base_prompt = """You are ALSARA, an expert ALS (Amyotrophic Lateral Sclerosis) research assistant with agentic capabilities for planning, execution, and reflection.
588
+
589
+ CRITICAL CONTEXT: ALL queries should be interpreted in the context of ALS research unless explicitly stated otherwise.
590
+
591
+ MANDATORY SEARCH QUERY RULES:
592
+ 1. ALWAYS include "ALS" or "amyotrophic lateral sclerosis" in EVERY search query
593
+ 2. If the user's query doesn't mention ALS, ADD IT to your search terms
594
+ 3. This prevents irrelevant results from other conditions
595
+
596
+ Examples:
597
+ - User: "genotyping for gene targeted treatments" → Search: "genotyping ALS gene targeted treatments"
598
+ - User: "psilocybin clinical trials" → Search: "psilocybin ALS clinical trials"
599
+ - User: "stem cell therapy" → Search: "stem cell therapy ALS"
600
+ - User: "gene therapy trials" → Search: "gene therapy ALS trials"
601
+
602
+ Your capabilities:
603
+ - Search PubMed for peer-reviewed research papers
604
+ - Find active clinical trials in the AACT database"""
605
+
606
+ # Add RAG capability only if enabled
607
+ enable_rag = os.getenv("ENABLE_RAG", "false").lower() == "true"
608
+ if enable_rag:
609
+ base_prompt += """
610
+ - **Semantic search using RAG**: Instantly search cached ALS research papers using AI-powered semantic matching"""
611
+
612
+ base_prompt += """
613
+ - Fetch and analyze web content
614
+ - Synthesize information from multiple sources
615
+ - Provide citations with PMIDs, DOIs, and NCT IDs
616
+
617
+ === AGENTIC WORKFLOW (REQUIRED) ===
618
+
619
+ You MUST follow ALL FOUR phases for EVERY query - no exceptions:
620
+
621
+ 1. **🎯 PLANNING PHASE** (MANDATORY - ALWAYS FIRST):
622
+ Before using any tools, you MUST explicitly outline your search strategy:"""
623
+
624
+ if enable_rag:
625
+ base_prompt += """
626
+ - FIRST check semantic cache using RAG for instant results from indexed papers"""
627
+
628
+ base_prompt += """
629
+ - State what databases you will search and in what order
630
+ - ALWAYS plan to search PubMed for peer-reviewed research
631
+ - For clinical questions, also include AACT trials database
632
+ - Identify key search terms and variations
633
+ - Explain your prioritization approach
634
+ - Format: MUST start on a NEW LINE with "**🎯 PLANNING:**" followed by your strategy
635
+
636
+ 2. **🔧 EXECUTION PHASE** (MANDATORY - AFTER PLANNING):
637
+ - MUST mark this phase on a NEW LINE with "**🔧 EXECUTING:**"
638
+ - Execute your planned searches systematically"""
639
+
640
+ if enable_rag:
641
+ base_prompt += """
642
+ - START with semantic search using RAG for instant cached results"""
643
+
644
+ base_prompt += """
645
+ - MINIMUM requirement: Search PubMed for peer-reviewed literature
646
+ - For clinical questions, search AACT trials database
647
+ - Gather initial results from each source
648
+ - Show tool calls and results
649
+ - This phase is for INITIAL searches only (as planned)
650
+
651
+ 3. **🤔 REFLECTION PHASE** (MANDATORY - AFTER EXECUTION):
652
+ After tool execution, you MUST ALWAYS reflect before synthesizing:
653
+
654
+ CRITICAL FORMAT REQUIREMENTS:
655
+ - MUST be EXACTLY: **🤔 REFLECTING:**
656
+ - MUST include the asterisks (**) for bold formatting
657
+ - MUST start on a NEW LINE (never inline with other text)
658
+ - WRONG: "🤔 REFLECTING:" (missing asterisks)
659
+ - WRONG: "search completed🤔 REFLECTING:" (inline, not on new line)
660
+ - CORRECT: New line, then **🤔 REFLECTING:**
661
+
662
+ Content requirements:
663
+ - Evaluate: "Do I have sufficient information to answer comprehensively?"
664
+ - Identify gaps: "What aspects of the query remain unaddressed?"
665
+ - Decide: "Should I refine my search or proceed to synthesis?"
666
+
667
+ CRITICAL: If you need more searches:
668
+ - DO NOT start a new PLANNING phase
669
+ - DO NOT write new phase markers
670
+ - Stay WITHIN the REFLECTION phase
671
+ - Simply continue searching and analyzing while in REFLECTING mode
672
+ - Additional searches are part of reflection, not a new workflow
673
+
674
+ - NEVER skip this phase - it ensures answer quality
675
+
676
+ 4. **✅ SYNTHESIS PHASE** (MANDATORY - FINAL PHASE):
677
+ - MUST start on a NEW LINE with "**✅ SYNTHESIS:**"
678
+ - Provide comprehensive synthesis of all findings
679
+ - Include all citations with URLs
680
+ - Summarize key insights
681
+ - **CONFIDENCE SCORING**: Include confidence level for key claims:
682
+ • High confidence (🟢): Multiple peer-reviewed studies or systematic reviews
683
+ • Moderate confidence (🟡): Limited studies or preprints with consistent findings
684
+ • Low confidence (🔴): Single study, conflicting evidence, or theoretical basis
685
+ - This phase MUST appear in EVERY response
686
+
687
+ FORMATTING RULES:
688
+ - Each phase marker MUST appear on its own line
689
+ - Never put phase markers inline with other text
690
+ - Always use the exact format: **[emoji] PHASE_NAME:**
691
+ - MUST include asterisks for bold: **🤔 REFLECTING:** not just 🤔 REFLECTING:
692
+ - Each phase should appear EXACTLY ONCE per response - never repeat the workflow
693
+
694
+ CRITICAL WORKFLOW RULES:
695
+ - You MUST include ALL FOUR phases in your response
696
+ - Each phase appears EXACTLY ONCE (never repeat Planning→Executing→Reflecting→Synthesis)
697
+ - Missing any phase is unacceptable
698
+ - Duplicating phases is unacceptable
699
+ - The workflow is a SINGLE CYCLE:
700
+ 1. PLANNING (once at start)
701
+ 2. EXECUTING (initial searches)
702
+ 3. REFLECTING (evaluate AND do additional searches if needed - all within this phase)
703
+ 4. SYNTHESIS (final answer)
704
+ - NEVER restart the workflow - additional searches happen WITHIN reflection
705
+
706
+ CRITICAL SYNTHESIS RULES:
707
+ - You MUST ALWAYS end with a ✅ SYNTHESIS phase
708
+ - If searches fail, state "Despite search limitations..." and provide knowledge-based answer
709
+ - If you reach iteration limits, immediately provide synthesis
710
+ - NEVER end without synthesis - this is a MANDATORY requirement
711
+ - If uncertain, start synthesis with: "Based on available information..."
712
+
713
+ SYNTHESIS MUST INCLUDE:
714
+ 1. Direct answer to the user's question
715
+ 2. Key findings from successful searches (if any)
716
+ 3. Citations with clickable URLs
717
+ 4. If searches failed: explanation + knowledge-based answer
718
+ 5. Suggested follow-up questions or alternative approaches
719
+
720
+ === SELF-CORRECTION BEHAVIOR ===
721
+
722
+ If your searches return zero or insufficient results:
723
+ - Try broader search terms (remove qualifiers)
724
+ - Try alternative terminology or synonyms
725
+ - Search for related concepts
726
+ - Explicitly state what you tried and what you found
727
+
728
+ When answering:
729
+ 1. Be concise in explanations while maintaining clarity
730
+ 2. Focus on presenting search results efficiently
731
+ 3. Always cite sources with specific identifiers AND URLs:
732
+ - PubMed: Include PMID and URL (https://pubmed.ncbi.nlm.nih.gov/PMID/)
733
+ - Preprints: Include DOI and URL (https://doi.org/DOI)
734
+ - Clinical Trials: Include NCT ID and URL (https://clinicaltrials.gov/study/NCTID)
735
+ 4. Use numbered citations [1], [2] with a references section at the end
736
+ 5. Prioritize recent research (2023-2025)
737
+ 6. When discussing preprints, note they are NOT peer-reviewed
738
+ 7. Explain complex concepts clearly
739
+ 8. Acknowledge uncertainty when appropriate
740
+ 9. Suggest related follow-up questions
741
+
742
+ CRITICAL CITATION RULES:
743
+ - ONLY cite papers, preprints, and trials that you have ACTUALLY found using the search tools
744
+ - NEVER make up or invent citations, PMIDs, DOIs, or NCT IDs
745
+ - NEVER cite papers from your training data unless you have verified them through search
746
+ - If you cannot find specific research on a topic, explicitly state "No studies found" rather than inventing citations
747
+ - Every citation must come from actual search results obtained through the available tools
748
+ - If asked about a topic you know from training but haven't searched, you MUST search first before citing
749
+
750
+ IMPORTANT: When referencing papers in your final answer, ALWAYS include clickable URLs alongside citations to make it easy for users to access the sources.
751
+
752
+ Available tools:
753
+ - pubmed__search_pubmed: Search peer-reviewed research literature
754
+ - pubmed__get_paper_details: Get full paper details from PubMed (USE SPARINGLY - only for most relevant papers)
755
+ # - biorxiv__search_preprints: (temporarily unavailable)
756
+ # - biorxiv__get_preprint_details: (temporarily unavailable)
757
+ - aact__search_aact_trials: Search clinical trials (PRIMARY - use this first)
758
+ - aact__get_aact_trial: Get specific trial details from AACT database
759
+ - trials_links__get_known_als_trials: Get curated list of important ALS trials (FALLBACK)
760
+ - trials_links__get_search_link: Generate direct ClinicalTrials.gov search URLs
761
+ - fetch__fetch_url: Retrieve web content
762
+
763
+ PERFORMANCE OPTIMIZATION:
764
+ - Search results already contain abstracts - use these for initial synthesis
765
+ - Only fetch full details for papers that are DIRECTLY relevant to the query
766
+ - Limit detail fetches to 5-7 most relevant items per database
767
+ - Prioritize based on: recency, relevance to query, impact/importance
768
+
769
+ Search strategy:
770
+ 1. Search all relevant databases (PubMed, AACT clinical trials)
771
+ 2. ALWAYS supplement with web fetching to:
772
+ - Find additional information not in databases
773
+ - Access sponsor/institution websites
774
+ - Get recent news and updates
775
+ - Retrieve full-text content when needed
776
+ - Verify and expand on database results
777
+ 3. Synthesize all sources for comprehensive answers
778
+
779
+ For clinical trials - NEW ARCHITECTURE:
780
+ PRIMARY SOURCE - AACT Database:
781
+ - Use search_aact_trials FIRST - provides comprehensive clinical trials data from AACT database
782
+ - 559,000+ trials available with no rate limits
783
+ - Use uppercase status values: RECRUITING, ACTIVE_NOT_RECRUITING, NOT_YET_RECRUITING, COMPLETED
784
+ - For ALS searches, the condition "ALS" will automatically match related terms
785
+
786
+ FALLBACK - Links Server (when AACT unavailable):
787
+ - Use get_known_als_trials for curated list of 8 important ALS trials
788
+ - Use get_search_link to generate search URLs for clinical trials
789
+ - Use get_trial_link to generate direct links to specific trials
790
+
791
+ ADDITIONAL SOURCES:
792
+ - If specific NCT IDs are mentioned, can also use fetch__fetch_url with:
793
+ https://clinicaltrials.gov/study/{NCT_ID}
794
+ - Search sponsor websites, medical news, and university pages for updates
795
+
796
+ ARCHITECTURE FLOW:
797
+ User Query → AACT Database (Primary)
798
+ ↓
799
+ If AACT unavailable
800
+ ↓
801
+ Links Server (Fallback)
802
+ ↓
803
+ Direct links to trial websites
804
+
805
+ Note: Direct ClinicalTrials.gov API access is unavailable - using the AACT database instead
806
+ """
807
+
808
+ # Add enhanced instructions for Llama models to improve thoroughness
809
+ if client.is_using_llama_primary():
810
+ llama_enhancement = """
811
+
812
+ ENHANCED SEARCH REQUIREMENTS FOR COMPREHENSIVE RESULTS:
813
+ You MUST follow this structured approach for EVERY research query:
814
+
815
+ === MANDATORY SEARCH PHASES ===
816
+ Phase 1 - Comprehensive Database Search (ALL databases REQUIRED):
817
+ □ Search PubMed with multiple keyword variations
818
+ □ Search AACT database for clinical trials
819
+ □ Use at least 3-5 different search queries per database
820
+
821
+ Phase 2 - Strategic Detail Fetching (BE SELECTIVE):
822
+ □ Get paper details for the TOP 5-7 most relevant PubMed results
823
+ □ Get trial details for the TOP 3-4 most relevant clinical trials
824
+ □ ONLY fetch details for papers that are DIRECTLY relevant to the query
825
+ □ Use search result abstracts to prioritize which papers need full details
826
+
827
+ Phase 3 - Synthesis Requirements:
828
+ □ Include ALL relevant papers found (not just top 3-5)
829
+ □ Organize by subtopic or treatment approach
830
+ □ Provide complete citations with URLs
831
+
832
+ MINIMUM SEARCH STANDARDS:
833
+ - For general queries: At least 10-15 total searches across all databases
834
+ - For specific treatments: At least 5-7 searches per database
835
+ - For comprehensive reviews: At least 15-20 total searches
836
+ - NEVER stop after finding just 2-3 results
837
+
838
+ EXAMPLE SEARCH PATTERN for "gene therapy ALS":
839
+ 1. pubmed__search_pubmed: "gene therapy ALS"
840
+ 2. pubmed__search_pubmed: "AAV ALS treatment"
841
+ 3. pubmed__search_pubmed: "SOD1 gene therapy"
842
+ 4. pubmed__search_pubmed: "C9orf72 gene therapy"
843
+ 5. pubmed__search_pubmed: "viral vector ALS"
844
+ # 6. biorxiv__search_preprints: (temporarily unavailable)
845
+ # 7. biorxiv__search_preprints: (temporarily unavailable)
846
+ 6. aact__search_aact_trials: condition="ALS", intervention="gene therapy"
847
+ 7. aact__search_aact_trials: condition="ALS", intervention="AAV"
848
+ 8. [Get details for the TOP 5-7 most relevant results]
849
+ 9. [Web fetch for recent developments]
850
+
851
+ CRITICAL: Thoroughness is MORE important than speed. Users expect comprehensive results."""
852
+
853
+ system_prompt = base_prompt + llama_enhancement
854
+ logger.info("Using enhanced prompting for Llama model to improve search thoroughness")
855
+ else:
856
+ # Use base prompt directly for Claude
857
+ system_prompt = base_prompt
858
+
859
+ # Import query classifier
860
+ from query_classifier import QueryClassifier
861
+
862
+ # Classify the query to determine processing mode
863
+ classification = QueryClassifier.classify_query(message)
864
+ processing_hint = QueryClassifier.get_processing_hint(classification)
865
+ logger.info(f"Query classification: {classification}")
866
+
867
+ # Check smart cache for similar queries first
868
+ cached_result = smart_cache.find_similar_cached(message)
869
+ if cached_result:
870
+ logger.info(f"Smart cache hit for query: {message[:50]}...")
871
+ yield "🎯 **Using cached result** (similar query found)\n\n"
872
+ yield cached_result
873
+ return
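`smart_cache` is created earlier in the file. Purely to illustrate the similarity-lookup idea, a minimal version might use token-set Jaccard overlap; the class name, threshold, and storage below are assumptions, not the real `SmartCache`:

```python
# Hypothetical similarity-cache sketch - not the actual smart_cache implementation
from typing import Dict, Optional

class SimilarityCacheSketch:
    def __init__(self, threshold: float = 0.8):
        self._store: Dict[frozenset, str] = {}
        self.threshold = threshold

    @staticmethod
    def _tokens(query: str) -> frozenset:
        return frozenset(query.lower().split())

    def put(self, query: str, result: str) -> None:
        self._store[self._tokens(query)] = result

    def find_similar_cached(self, query: str) -> Optional[str]:
        q = self._tokens(query)
        for key, result in self._store.items():
            union = q | key
            # Jaccard similarity between the two token sets
            if union and len(q & key) / len(union) >= self.threshold:
                return result
        return None
```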
874
+
875
+ # Check if this is a high-frequency query with special config
876
+ high_freq_config = smart_cache.get_high_frequency_config(message)
877
+ if high_freq_config:
878
+ logger.info(f"High-frequency query detected with config: {high_freq_config}")
879
+ # Note: We could use optimized search terms or Claude here
880
+ # For now, just log it and continue with normal processing
881
+
882
+ # Get available tools
883
+ tools = await get_all_tools()
884
+
885
+ # Check if this is a simple query that doesn't need research
886
+ if not classification['requires_research']:
887
+ # Simple query - skip the full research workflow
888
+ logger.info(f"Simple query detected - using direct response mode: {classification['reason']}")
889
+
890
+ # Mark that this response won't use research workflow (disable voice button)
891
+ global last_response_was_research
892
+ last_response_was_research = False
893
+
894
+ # Use a simplified prompt for non-research queries
895
+ simple_prompt = """You are an AI assistant for ALS research questions.
896
+ For this query, provide a helpful, conversational response without using research tools.
897
+ Keep your response friendly and informative."""
898
+
899
+ # For simple queries, just make one API call without tools
900
+ messages = [
901
+ {"role": "system", "content": simple_prompt},
902
+ {"role": "user", "content": message}
903
+ ]
904
+
905
+ # Display processing hint
906
+ yield f"{processing_hint}\n\n"
907
+
908
+ # Single API call for simple response (no tools)
909
+ async for response_text, tool_calls, provider_used in stream_with_retry(
910
+ client=client,
911
+ messages=messages,
912
+ tools=None, # No tools for simple queries
913
+ system_prompt=simple_prompt,
914
+ max_retries=2,
915
+ model=ANTHROPIC_MODEL,
916
+ max_tokens=2000, # Shorter responses for simple queries
917
+ stream_name="simple response"
918
+ ):
919
+ yield response_text
920
+
921
+ # Return early - skip all the research phases
922
+ return
923
+
924
+ # Research query - use full workflow with tools
925
+ logger.info(f"Research query detected - using full workflow: {classification['reason']}")
926
+
927
+ # Mark that this response will use research workflow (enable voice button)
928
+ last_response_was_research = True
929
+ yield f"{processing_hint}\n\n"
930
+
931
+ # Build messages for research workflow
932
+ messages = [
933
+ {"role": "system", "content": system_prompt}
934
+ ]
935
+
936
+ # Add history (remove Gradio metadata)
937
+ if history:
938
+ # Only keep 'role' and 'content' fields from messages
939
+ for msg in history:
940
+ if isinstance(msg, dict):
941
+ messages.append({
942
+ "role": msg.get("role"),
943
+ "content": msg.get("content")
944
+ })
945
+ else:
946
+ messages.append(msg)
947
+
948
+ # Add current message
949
+ messages.append({"role": "user", "content": message})
950
+
951
+ # Initial API call with streaming using helper function
952
+ full_response = ""
953
+ tool_calls = []
954
+
955
+ # Use the stream_with_retry helper to handle all retry logic
956
+ provider_used = "Anthropic Claude" # Track which provider
957
+ async for response_text, current_tool_calls, provider_used in stream_with_retry(
958
+ client=client,
959
+ messages=messages,
960
+ tools=tools,
961
+ system_prompt=system_prompt,
962
+ max_retries=2, # Increased from 0 to allow retries
963
+ model=ANTHROPIC_MODEL,
964
+ max_tokens=MAX_RESPONSE_TOKENS,
965
+ stream_name="initial API call"
966
+ ):
967
+ full_response = response_text
968
+ tool_calls = current_tool_calls
969
+ # Apply single-pass filtering when yielding
970
+ # Optionally show provider info when using fallback
971
+ if provider_used != "Anthropic Claude" and response_text:
972
+ yield f"[Using {provider_used}]\n{filter_internal_tags(full_response)}"
973
+ else:
974
+ yield filter_internal_tags(full_response)
975
+
976
+ # Handle recursive tool calls (agent may need multiple searches)
977
+ tool_iteration = 0
978
+
979
+ # Adjust iteration limit based on query complexity
980
+ if is_complex_query(message):
981
+ max_tool_iterations = 5
982
+ logger.info("Complex query detected - allowing up to 5 iterations")
983
+ else:
984
+ max_tool_iterations = 3
985
+ logger.info("Standard query - allowing up to 3 iterations")
986
+
987
+ while tool_calls and tool_iteration < max_tool_iterations:
988
+ tool_iteration += 1
989
+ logger.info(f"Tool iteration {tool_iteration}: processing {len(tool_calls)} tool calls")
990
+
991
+ # No need to re-yield the planning phase - it was already shown
992
+
993
+ # Build assistant message using helper
994
+ assistant_content = build_assistant_message(
995
+ text_content=full_response,
996
+ tool_calls=tool_calls
997
+ )
998
+
999
+ messages.append({
1000
+ "role": "assistant",
1001
+ "content": assistant_content
1002
+ })
1003
+
1004
+ # Show working indicator for long searches
1005
+ num_tools = len(tool_calls)
1006
+ if num_tools > 0:
1007
+ working_text = f"\n⏳ **Searching {num_tools} database{'s' if num_tools > 1 else ''} in parallel...** "
1008
+ if num_tools > 2:
1009
+ working_text += f"(this typically takes 30-45 seconds)\n"
1010
+ elif num_tools > 1:
1011
+ working_text += f"(this typically takes 15-30 seconds)\n"
1012
+ else:
1013
+ working_text += "\n"
1014
+ full_response += working_text
1015
+ yield filter_internal_tags(full_response) # Show working indicator immediately
1016
+
1017
+ # Execute tool calls in parallel for better performance
1018
+ from parallel_tool_execution import execute_tool_calls_parallel
1019
+ progress_text, tool_results_content = await execute_tool_calls_parallel(
1020
+ tool_calls=tool_calls,
1021
+ call_mcp_tool_func=call_mcp_tool
1022
+ )
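`parallel_tool_execution.py` is not included in this commit. A minimal sketch of the gather-based pattern it presumably implements, with the return shape inferred from this call site (the `tool_result` blocks match the Anthropic tool-use format used elsewhere in this file):

```python
# Hypothetical sketch of execute_tool_calls_parallel - inferred from the call site, not the real module
import asyncio
from typing import Any, Callable, Dict, List, Tuple

async def execute_tool_calls_parallel_sketch(
    tool_calls: List[Dict[str, Any]],
    call_mcp_tool_func: Callable,
) -> Tuple[str, List[Dict[str, Any]]]:
    async def run_one(call: Dict[str, Any]) -> Dict[str, Any]:
        try:
            result = await call_mcp_tool_func(call["name"], call.get("input", {}))
        except Exception as e:
            result = f"Tool error: {e}"
        return {
            "type": "tool_result",
            "tool_use_id": call["id"],
            "content": [{"type": "text", "text": str(result)}],
        }

    # asyncio.gather runs all tool calls concurrently instead of sequentially
    results = await asyncio.gather(*(run_one(c) for c in tool_calls))
    progress = f"\n✓ Completed {len(results)} tool call(s)\n"
    return progress, list(results)
```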
1023
+
1024
+ # Add progress text to full response and yield accumulated content
1025
+ full_response += progress_text
1026
+ if progress_text:
1027
+ yield filter_internal_tags(full_response) # Yield full accumulated response
1028
+
1029
+ # Add single user message with ALL tool results
1030
+ messages.append({
1031
+ "role": "user",
1032
+ "content": tool_results_content
1033
+ })
1034
+
1035
+ # Smart reflection: Only add reflection prompt if results seem incomplete
1036
+ if tool_iteration == 1:
1037
+ # First iteration - use normal workflow with reflection
1038
+ # Check confidence indicators in tool results
1039
+ results_text = str(tool_results_content).lower()
1040
+
1041
+ # Indicators of low confidence/incomplete results
1042
+ low_confidence_indicators = [
1043
+ 'no results found', '0 results', 'no papers',
1044
+ 'no trials', 'limited', 'insufficient', 'few results'
1045
+ ]
1046
+
1047
+ # Indicators of high confidence/complete results
1048
+ high_confidence_indicators = [
1049
+ 'recent study', 'multiple studies', 'clinical trial',
1050
+ 'systematic review', 'meta-analysis', 'significant results'
1051
+ ]
1052
+
1053
+ # Count confidence indicators
1054
+ low_conf_count = sum(1 for ind in low_confidence_indicators if ind in results_text)
1055
+ high_conf_count = sum(1 for ind in high_confidence_indicators if ind in results_text)
1056
+
1057
+ # Calculate total results found across all tools
1058
+ import re
1059
+ result_numbers = re.findall(r'(\d+)\s+(?:results?|papers?|studies|trials?)', results_text)
1060
+ total_results = sum(int(n) for n in result_numbers) if result_numbers else 0
1061
+
1062
+ # Decide if reflection is needed - more aggressive skipping for performance
1063
+ needs_reflection = (
1064
+ low_conf_count > 1 or # Only if multiple low-confidence indicators
1065
+ (high_conf_count == 0 and total_results < 10) or # No high confidence AND few results
1066
+ total_results < 3 # Almost no results at all
1067
+ )
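A quick illustrative run of the counting part of this heuristic on two synthetic result blobs:

```python
# Illustrative only: how the result-count regex above behaves on synthetic text
import re

def count_results(text: str) -> int:
    nums = re.findall(r'(\d+)\s+(?:results?|papers?|studies|trials?)', text.lower())
    return sum(int(n) for n in nums)

sparse = "no results found in pubmed; aact returned 2 results"
rich = "multiple studies identified; pubmed returned 15 results"
print(count_results(sparse))  # 2  -> needs_reflection is True (total_results < 3)
print(count_results(rich))    # 15 -> with a high-confidence phrase present, reflection is skipped
```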
1068
+
1069
+ if needs_reflection:
1070
+ reflection_prompt = [
1071
+ {"type": "text", "text": "\n\n**SMART REFLECTION:** Based on the results so far, please evaluate:\n\n1. Do you have sufficient high-quality information to answer comprehensively?\n2. Are there important aspects that need more investigation?\n3. Would refining search terms or trying different databases help?\n\nIf confident with current information (found relevant studies/trials), proceed to synthesis with (**✅ ANSWER:**). Otherwise, use reflection markers (**🤔 REFLECTING:**) and search for missing information."}
1072
+ ]
1073
+ messages.append({
1074
+ "role": "user",
1075
+ "content": reflection_prompt
1076
+ })
1077
+ logger.info(f"Smart reflection triggered (low_conf:{low_conf_count}, high_conf:{high_conf_count}, results:{total_results})")
1078
+ else:
1079
+ # High confidence - skip reflection and go straight to synthesis
1080
+ logger.info(f"Skipping reflection - high confidence (low_conf:{low_conf_count}, high_conf:{high_conf_count}, results:{total_results})")
1081
+ # Add a synthesis-only prompt
1082
+ synthesis_prompt = [
1083
+ {"type": "text", "text": "\n\n**HIGH CONFIDENCE RESULTS:** The search returned comprehensive information. Please proceed directly to synthesis with (**✅ SYNTHESIS:**) and provide a complete answer based on the findings."}
1084
+ ]
1085
+ messages.append({
1086
+ "role": "user",
1087
+ "content": synthesis_prompt
1088
+ })
1089
+ else:
1090
+ # Subsequent iterations (tool_iteration > 1) - UPDATE existing synthesis without repeating workflow phases
1091
+ logger.info(f"Iteration {tool_iteration}: Updating synthesis with additional results")
1092
+ update_prompt = [
1093
+ {"type": "text", "text": "\n\n**ADDITIONAL RESULTS:** You have gathered more information. Please UPDATE your previous synthesis by integrating these new findings. Do NOT repeat the planning/executing/reflecting phases - just provide an updated synthesis that incorporates both the previous and new information. Continue directly with the updated content, no phase markers needed."}
1094
+ ]
1095
+ messages.append({
1096
+ "role": "user",
1097
+ "content": update_prompt
1098
+ })
1099
+
1100
+ # Second API call with tool results (with retry logic)
1101
+ logger.info("Starting second streaming API call with tool results...")
1102
+ logger.info(f"Messages array has {len(messages)} messages")
1103
+ logger.info(f"Last 3 messages: {json.dumps([{'role': m.get('role'), 'content_type': type(m.get('content')).__name__, 'content_len': len(str(m.get('content')))} for m in messages[-3:]], indent=2)}")
1104
+ # Log the actual tool results content
1105
+ logger.info(f"Tool results content ({len(tool_results_content)} items): {json.dumps(tool_results_content[:1], indent=2) if tool_results_content else 'EMPTY'}") # Log first item only to avoid spam
1106
+
1107
+ # Second streaming call for synthesis
1108
+ synthesis_response = ""
1109
+ additional_tool_calls = []
1110
+
1111
+ # For subsequent iterations, use modified system prompt that doesn't require all phases
1112
+ iteration_system_prompt = system_prompt
1113
+ if tool_iteration > 1:
1114
+ iteration_system_prompt = """You are an AI assistant specializing in ALS (Amyotrophic Lateral Sclerosis) research.
1115
+
1116
+ You are continuing your research with additional results. Please integrate the new findings into an updated response.
1117
+
1118
+ IMPORTANT: Do NOT repeat the workflow phases (Planning/Executing/Reflecting/Synthesis) - you've already done those.
1119
+ Simply provide updated content that incorporates both previous and new information.
1120
+ Start your response directly with the updated information, no phase markers needed."""
1121
+
1122
+ # Limit tools on subsequent iterations to prevent endless loops
1123
+ available_tools = tools if tool_iteration == 1 else [] # No more tools after first iteration
1124
+
1125
+ async for response_text, current_tool_calls, provider_used in stream_with_retry(
1126
+ client=client,
1127
+ messages=messages,
1128
+ tools=available_tools,
1129
+ system_prompt=iteration_system_prompt,
1130
+ max_retries=2,
1131
+ model=ANTHROPIC_MODEL,
1132
+ max_tokens=MAX_RESPONSE_TOKENS,
1133
+ stream_name="synthesis API call"
1134
+ ):
1135
+ synthesis_response = response_text
1136
+ additional_tool_calls = current_tool_calls
1137
+
1138
+ full_response += synthesis_response
1139
+ # Yield the full accumulated response including planning, execution, and synthesis
1140
+ yield filter_internal_tags(full_response)
1141
+
1142
+ # Check for additional tool calls
1143
+ if additional_tool_calls:
1144
+ logger.info(f"Found {len(additional_tool_calls)} recursive tool calls")
1145
+
1146
+ # Check if we're about to hit the iteration limit
1147
+ if tool_iteration >= (max_tool_iterations - 1): # Last iteration before limit
1148
+ # We're on the last allowed iteration
1149
+ logger.info(f"Approaching iteration limit ({max_tool_iterations}), wrapping up with current results")
1150
+
1151
+ # Don't execute more tools, instead trigger final synthesis
1152
+ # Add a user message to force final synthesis without tools
1153
+ messages.append({
1154
+ "role": "user",
1155
+ "content": [{"type": "text", "text": "Please provide a complete synthesis of all the information you've found so far. No more searches are available - summarize what you've discovered."}]
1156
+ })
1157
+
1158
+ # Make one final API call to synthesize all the results
1159
+ final_synthesis = ""
1160
+ async for response_text, _, provider_used in stream_with_retry(
1161
+ client=client,
1162
+ messages=messages,
1163
+ tools=[], # No tools for final synthesis
1164
+ system_prompt=system_prompt,
1165
+ max_retries=1,
1166
+ model=ANTHROPIC_MODEL,
1167
+ max_tokens=MAX_RESPONSE_TOKENS,
1168
+ stream_name="final synthesis"
1169
+ ):
1170
+ final_synthesis = response_text
1171
+
1172
+ full_response += final_synthesis
1173
+ # Yield the full accumulated response
1174
+ yield filter_internal_tags(full_response)
1175
+
1176
+ # Clear tool_calls to exit the loop gracefully
1177
+ tool_calls = []
1178
+ else:
1179
+ # We have room for more iterations, proceed normally
1180
+ # Build assistant message for recursive calls
1181
+ assistant_content = build_assistant_message(
1182
+ text_content=synthesis_response,
1183
+ tool_calls=additional_tool_calls
1184
+ )
1185
+
1186
+ messages.append({
1187
+ "role": "assistant",
1188
+ "content": assistant_content
1189
+ })
1190
+
1191
+ # Execute recursive tool calls
1192
+ progress_text, tool_results_content = await execute_tool_calls_parallel(
1193
+ tool_calls=additional_tool_calls,
1194
+ call_mcp_tool_func=call_mcp_tool
1195
+ )
1196
+
1197
+ full_response += progress_text
1198
+ # Yield the full accumulated response
1199
+ if progress_text:
1200
+ yield filter_internal_tags(full_response)
1201
+
1202
+ # Add results and continue loop
1203
+ messages.append({
1204
+ "role": "user",
1205
+ "content": tool_results_content
1206
+ })
1207
+
1208
+ # Set tool_calls for next iteration
1209
+ tool_calls = additional_tool_calls
1210
+ else:
1211
+ # No more tool calls, exit loop
1212
+ tool_calls = []
1213
+
1214
+ if tool_iteration >= max_tool_iterations:
1215
+ logger.warning(f"Reached maximum tool iterations ({max_tool_iterations})")
1216
+
1217
+ # Force synthesis if we haven't provided one yet
1218
+ if tool_iteration > 0 and "✅ SYNTHESIS:" not in full_response:
1219
+ logger.warning(f"No synthesis found after {tool_iteration} iterations, forcing synthesis")
1220
+
1221
+ # Add a forced synthesis prompt
1222
+ synthesis_prompt_content = [{"type": "text", "text": "You MUST now provide a ✅ SYNTHESIS phase. Synthesize whatever information you've gathered, even if searches were limited. If you couldn't find specific research, provide knowledge-based answers with appropriate caveats."}]
1223
+ messages.append({
1224
+ "role": "user",
1225
+ "content": synthesis_prompt_content
1226
+ })
1227
+
1228
+ # Make final synthesis call without tools
1229
+ forced_synthesis = ""
1230
+ async for response_text, _, _ in stream_with_retry(
1231
+ client=client,
1232
+ messages=messages,
1233
+ tools=[], # No tools - just synthesize
1234
+ system_prompt=system_prompt,
1235
+ max_retries=1,
1236
+ model=ANTHROPIC_MODEL,
1237
+ max_tokens=MAX_RESPONSE_TOKENS,
1238
+ stream_name="forced synthesis"
1239
+ ):
1240
+ forced_synthesis = response_text
1241
+
1242
+ full_response += "\n\n" + forced_synthesis
1243
+ # Yield the full accumulated response with forced synthesis
1244
+ yield filter_internal_tags(full_response)
1245
+
1246
+ # No final yield needed - response has already been yielded incrementally
1247
+
1248
+ # Record successful response time
1249
+ response_time = time.time() - start_time
1250
+ health_monitor.record_response_time(response_time)
1251
+ logger.info(f"Request completed in {response_time:.2f} seconds")
1252
+
1253
+ except Exception as e:
1254
+ logger.error(f"Error in als_research_agent: {e}", exc_info=True)
1255
+ health_monitor.record_error(str(e))
1256
+ error_message = format_error_message(e, context=f"Processing query: {message[:100]}...")
1257
+ yield error_message
1258
+
1259
+ # Gradio Interface
1260
+ async def main() -> None:
1261
+ """Main function to setup and launch the Gradio interface"""
1262
+ global cleanup_task
1263
+
1264
+ try:
1265
+ # Setup MCP servers
1266
+ logger.info("Setting up MCP servers...")
1267
+ await setup_mcp_servers()
1268
+ logger.info("MCP servers initialized successfully")
1269
+
1270
+ # Start memory cleanup task
1271
+ cleanup_task = asyncio.create_task(cleanup_memory())
1272
+ logger.info("Memory cleanup task started")
1273
+
1274
+ except Exception as e:
1275
+ logger.error(f"Failed to initialize MCP servers: {e}", exc_info=True)
1276
+ raise
1277
+
1278
+ # Create Gradio interface with export button
1279
+ with gr.Blocks() as demo:
1280
+ gr.Markdown("# 🧬 ALSARA - ALS Agentic Research Assistant ")
1281
+ gr.Markdown("Ask questions about ALS research, treatments, and clinical trials. This agent searches PubMed, AACT clinical trials database, and other sources in real-time.")
1282
+
1283
+ # Show LLM configuration status using unified client
1284
+ llm_status = f"🤖 **LLM Provider:** {client.get_provider_display_name()}"
1285
+ gr.Markdown(llm_status)
1286
+
1287
+ with gr.Tabs():
1288
+ with gr.TabItem("Chat"):
1289
+ chatbot = gr.Chatbot(
1290
+ height=600,
1291
+ show_label=False,
1292
+ allow_tags=True, # Allow custom HTML tags from LLMs (Gradio 6 default)
1293
+ elem_classes="chatbot-container"
1294
+ )
1295
+
1296
+ with gr.TabItem("System Health"):
1297
+ gr.Markdown("## 📊 System Health Monitor")
1298
+
1299
+ def format_health_status():
1300
+ """Format health status for display"""
1301
+ status = health_monitor.get_health_status()
1302
+ return f"""
1303
+ **Status:** {status['status'].upper()} {'✅' if status['status'] == 'healthy' else '⚠️'}
1304
+
1305
+ **Uptime:** {status['uptime_seconds'] / 3600:.1f} hours
1306
+ **Total Requests:** {status['request_count']}
1307
+ **Error Rate:** {status['error_rate']:.1%}
1308
+ **Avg Response Time:** {status['avg_response_time']:.2f}s
1309
+
1310
+ **Cache Status:**
1311
+ - Cache Size: {status['cache_size']} items
1312
+ - Rate Limiter: {status['rate_limit_status']}
1313
+
1314
+ **Most Used Tools:**
1315
+ {chr(10).join([f"- {tool}: {count} calls" for tool, count in status['most_used_tools'].items()])}
1316
+
1317
+ **Last Error:** {status['last_error']['error'] if status['last_error'] else 'None'}
1318
+ """
1319
+
1320
+ health_display = gr.Markdown(format_health_status())
1321
+ refresh_btn = gr.Button("🔄 Refresh Health Status")
1322
+ refresh_btn.click(fn=format_health_status, outputs=health_display)
1323
+
1324
+ with gr.Row():
1325
+ with gr.Column(scale=6):
1326
+ msg = gr.Textbox(
1327
+ placeholder="Ask about ALS research, treatments, or clinical trials...",
1328
+ container=False,
1329
+ label="Type your question or use voice input"
1330
+ )
1331
+ with gr.Column(scale=1):
1332
+ audio_input = gr.Audio(
1333
+ sources=["microphone"],
1334
+ type="filepath",
1335
+ label="🎤 Voice Input"
1336
+ )
1337
+ export_btn = gr.DownloadButton("💾 Export", scale=1)
1338
+
1339
+ with gr.Row():
1340
+ submit_btn = gr.Button("Submit", variant="primary")
1341
+ retry_btn = gr.Button("🔄 Retry")
1342
+ undo_btn = gr.Button("↩️ Undo")
1343
+ clear_btn = gr.Button("🗑️ Clear")
1344
+ speak_btn = gr.Button("🔊 Read Last Response", variant="secondary", interactive=False)
1345
+
1346
+ # Audio output component (initially hidden)
1347
+ with gr.Row(visible=False) as audio_row:
1348
+ audio_output = gr.Audio(
1349
+ label="🔊 Voice Output",
1350
+ type="filepath",
1351
+ autoplay=True,
1352
+ visible=True
1353
+ )
1354
+
1355
+ gr.Examples(
1356
+ examples=[
1357
+ "Psilocybin trials and use in therapy",
1358
+ "Role of Omega-3 and omega-6 fatty acids in ALS treatment",
1359
+ "List the main genes that should be tested for ALS gene therapy eligibility",
1360
+ "What are the latest SOD1-targeted therapies in recent preprints?",
1361
+ "Find recruiting clinical trials for bulbar-onset ALS",
1362
+ "Explain the role of TDP-43 in ALS pathology",
1363
+ "What is the current status of tofersen clinical trials?",
1364
+ "Are there any new combination therapies being studied?",
1365
+ "What's the latest research on ALS biomarkers from the past 60 days?",
1366
+ "Search PubMed for recent ALS gene therapy research"
1367
+ ],
1368
+ inputs=msg
1369
+ )
1370
+
1371
+ # Chat interface logic with improved error handling
1372
+ async def respond(message: str, history: Optional[List[Dict[str, str]]]) -> AsyncGenerator[List[Dict[str, str]], None]:
1373
+ history = history or []
1374
+ # Append user message
1375
+ history.append({"role": "user", "content": message})
1376
+ # Append empty assistant message
1377
+ history.append({"role": "assistant", "content": ""})
1378
+
1379
+ try:
1380
+ # Pass history without the new messages to als_research_agent
1381
+ async for response in als_research_agent(message, history[:-2]):
1382
+ # Update the last assistant message in place
1383
+ history[-1]['content'] = response
1384
+ yield history
1385
+ except Exception as e:
1386
+ logger.error(f"Error in respond: {e}", exc_info=True)
1387
+ error_msg = f"❌ Error: {str(e)}"
1388
+ history[-1]['content'] = error_msg
1389
+ yield history
1390
+
1391
+ def update_speak_button():
1392
+ """Update the speak button state based on last_response_was_research"""
1393
+ global last_response_was_research
1394
+ return gr.update(interactive=last_response_was_research)
1395
+
1396
+ def undo_last(history: Optional[List[Dict[str, str]]]) -> Optional[List[Dict[str, str]]]:
1397
+ """Remove the last message pair from history"""
1398
+ if history and len(history) >= 2:
1399
+ # Remove last user message and assistant response
1400
+ return history[:-2]
1401
+ return history
1402
+
1403
+ async def retry_last(history: Optional[List[Dict[str, str]]]) -> AsyncGenerator[List[Dict[str, str]], None]:
1404
+ """Retry the last query with error handling"""
1405
+ if history and len(history) >= 2:
1406
+ # Get the last user message
1407
+ last_user_msg = history[-2]["content"] if history[-2]["role"] == "user" else None
1408
+ if last_user_msg:
1409
+ # Remove last assistant message, keep user message
1410
+ history = history[:-1]
1411
+ # Add new empty assistant message
1412
+ history.append({"role": "assistant", "content": ""})
1413
+ try:
1414
+ # Resubmit (pass history without the last user and assistant messages)
1415
+ async for response in als_research_agent(last_user_msg, history[:-2]):
1416
+ # Update the last assistant message in place
1417
+ history[-1]['content'] = response
1418
+ yield history
1419
+ except Exception as e:
1420
+ logger.error(f"Error in retry_last: {e}", exc_info=True)
1421
+ error_msg = f"❌ Error during retry: {str(e)}"
1422
+ history[-1]['content'] = error_msg
1423
+ yield history
1424
+ else:
1425
+ yield history
1426
+ else:
1427
+ yield history
1428
+
1429
+ async def process_voice_input(audio_file):
1430
+ """Process voice input and convert to text"""
1431
+ try:
1432
+ if audio_file is None:
1433
+ return ""
1434
+
1435
+ # Try to use speech recognition if available
1436
+ try:
1437
+ import speech_recognition as sr
1438
+ recognizer = sr.Recognizer()
1439
+
1440
+ # Load audio file
1441
+ with sr.AudioFile(audio_file) as source:
1442
+ audio_data = recognizer.record(source)
1443
+
1444
+ # Use Google's free speech recognition
1445
+ try:
1446
+ text = recognizer.recognize_google(audio_data)
1447
+ logger.info(f"Voice input transcribed: {text[:50]}...")
1448
+ return text
1449
+ except sr.UnknownValueError:
1450
+ logger.warning("Could not understand audio")
1451
+ return ""
1452
+ except sr.RequestError as e:
1453
+ logger.error(f"Speech recognition service error: {e}")
1454
+ return ""
1455
+
1456
+ except ImportError:
1457
+ logger.warning("speech_recognition not available")
1458
+ return ""
1459
+
1460
+ except Exception as e:
1461
+ logger.error(f"Error processing voice input: {e}")
1462
+ return ""
1463
+
1464
+ async def speak_last_response(history: Optional[List[Dict[str, str]]]) -> Tuple[gr.update, gr.update]:
1465
+ """Convert the last assistant response to speech using ElevenLabs"""
1466
+ try:
1467
+ # Check if the last response was from research workflow
1468
+ global last_response_was_research
1469
+ if not last_response_was_research:
1470
+ # This shouldn't happen since button is disabled, but handle it gracefully
1471
+ logger.info("Last response was not research-based, voice synthesis not available")
1472
+ return gr.update(visible=False), gr.update(value=None)
1473
+
1474
+ # Check ELEVENLABS_API_KEY
1475
+ api_key = os.getenv("ELEVENLABS_API_KEY")
1476
+ if not api_key:
1477
+ logger.warning("No ELEVENLABS_API_KEY configured")
1478
+ return gr.update(visible=True), gr.update(
1479
+ value=None,
1480
+ label="⚠️ Voice service unavailable - Please set ELEVENLABS_API_KEY"
1481
+ )
1482
+
1483
+ if not history or len(history) < 1:
1484
+ logger.warning("No history available for text-to-speech")
1485
+ return gr.update(visible=True), gr.update(
1486
+ value=None,
1487
+ label="⚠️ No conversation history to read"
1488
+ )
1489
+
1490
+ # Get the last assistant response
1491
+ last_response = None
1492
+
1493
+ # Detect and handle different history formats
1494
+ if isinstance(history, list) and len(history) > 0:
1495
+ # Check if history is a list of lists (Gradio chatbot format)
1496
+ if isinstance(history[0], list) and len(history[0]) == 2:
1497
+ # Format: [[user_msg, assistant_msg], ...]
1498
+ logger.info("Detected Gradio list-of-lists history format")
1499
+ for i, exchange in enumerate(reversed(history)):
1500
+ if len(exchange) == 2 and exchange[1]: # assistant message is second
1501
+ last_response = exchange[1]
1502
+ break
1503
+ elif isinstance(history[0], dict):
1504
+ # Format: [{"role": "user", "content": "..."}, ...]
1505
+ logger.info("Detected dict-based history format")
1506
+ for i, msg in enumerate(reversed(history)):
1507
+ if msg.get("role") == "assistant" and msg.get("content"):
1508
+ content = msg["content"]
1509
+ # CRITICAL FIX: Handle Claude API content blocks
1510
+ if isinstance(content, list):
1511
+ # Extract text from content blocks
1512
+ text_parts = []
1513
+ for block in content:
1514
+ if isinstance(block, dict):
1515
+ # Handle text block
1516
+ if block.get("type") == "text" and "text" in block:
1517
+ text_parts.append(block["text"])
1518
+ # Handle string content in dict
1519
+ elif "content" in block and isinstance(block["content"], str):
1520
+ text_parts.append(block["content"])
1521
+ elif isinstance(block, str):
1522
+ text_parts.append(block)
1523
+ last_response = "\n".join(text_parts)
1524
+ else:
1525
+ # Content is already a string
1526
+ last_response = content
1527
+ break
1528
+ elif isinstance(history[0], str):
1529
+ # Simple string list - take the last one
1530
+ logger.info("Detected simple string list history format")
1531
+ last_response = history[-1] if history else None
1532
+ else:
1533
+ # Unknown format - try to extract what we can
1534
+ logger.warning(f"Unknown history format: {type(history[0])}")
1535
+ # Try to convert to string as last resort
1536
+ try:
1537
+ last_response = str(history[-1]) if history else None
1538
+ except Exception as e:
1539
+ logger.error(f"Failed to extract last response: {e}")
1540
+
1541
+ if not last_response:
1542
+ logger.warning("No assistant response found in history")
1543
+ return gr.update(visible=True), gr.update(
1544
+ value=None,
1545
+ label="⚠️ No assistant response found to read"
1546
+ )
1547
+
1548
+ # Clean the response text (remove markdown, internal tags, etc.)
1549
+ # Convert to string if not already (safety check)
1550
+ last_response = str(last_response)
1551
+
1552
+ # IMPORTANT: Extract only the synthesis/main answer, skip references and "for more information"
1553
+ # Find where to cut off the response
1554
+ cutoff_patterns = [
1555
+ # Clear section headers with colons - most reliable indicators
1556
+ r'\n\s*(?:For (?:more|additional|further) (?:information|details|reading))\s*[::]',
1557
+ r'\n\s*(?:References?|Sources?|Citations?|Bibliography)\s*[::]',
1558
+ r'\n\s*(?:Additional (?:resources?|information|reading|materials?))\s*[::]',
1559
+
1560
+ # Markdown headers for reference sections (must be on their own line)
1561
+ r'\n\s*#{1,6}\s+(?:References?|Sources?|Citations?|Bibliography)\s*$',
1562
+ r'\n\s*#{1,6}\s+(?:For (?:more|additional|further) (?:information|details))\s*$',
1563
+ r'\n\s*#{1,6}\s+(?:Additional (?:Resources?|Information|Reading))\s*$',
1564
+ r'\n\s*#{1,6}\s+(?:Further Reading|Learn More)\s*$',
1565
+
1566
+ # Bold headers for reference sections (with newline after)
1567
+ r'\n\s*\*\*(?:References?|Sources?|Citations?)\*\*\s*[::]?\s*\n',
1568
+ r'\n\s*\*\*(?:For (?:more|additional) information)\*\*\s*[::]?\s*\n',
1569
+
1570
+ # Phrases that clearly introduce reference lists
1571
+ r'\n\s*(?:Here are|Below are|The following are)\s+(?:the |some |additional )?(?:references|sources|citations|papers cited|studies referenced)',
1572
+ r'\n\s*(?:References used|Sources consulted|Papers cited|Studies referenced)\s*[::]',
1573
+ r'\n\s*(?:Key|Recent|Selected|Relevant)\s+(?:references?|publications?|citations)\s*[::]',
1574
+
1575
+ # Clinical trials section headers with clear separators
1576
+ r'\n\s*(?:Clinical trials?|Studies|Research papers?)\s+(?:referenced|cited|mentioned|used)\s*[::]',
1577
+ r'\n\s*(?:AACT|ClinicalTrials\.gov)\s+(?:database entries?|trial IDs?|references?)\s*[::]',
1578
+
1579
+ # Web link sections
1580
+ r'\n\s*(?:Links?|URLs?|Websites?|Web resources?)\s*[::]',
1581
+ r'\n\s*(?:Visit|See|Check out)\s+(?:these|the following)\s+(?:links?|websites?|resources?)',
1582
+ r'\n\s*(?:Learn more|Read more|Find out more|Get more information)\s+(?:at|here|below)\s*[::]',
1583
+
1584
+ # Academic citation lists (only when preceded by double newline or clear separator)
1585
+ r'\n\n\s*\d+\.\s+[A-Z][a-z]+.*?et al\..*?(?:PMID|DOI|Journal)',
1586
+ r'\n\n\s*\[1\]\s+[A-Z][a-z]+.*?(?:et al\.|https?://)',
1587
+
1588
+ # Direct ID listings (clearly separate from main content)
1589
+ r'\n\s*(?:PMID|DOI|NCT)\s*[::]\s*\d+',
1590
+ r'\n\s*(?:Trial IDs?|Study IDs?)\s*[::]',
1591
+
1592
+ # Footer sections
1593
+ r'\n\s*(?:Note|Notes|Disclaimer|Important notice)\s*[::]',
1594
+ r'\n\s*(?:Data (?:source|from)|Database|Repository)\s*[::]',
1595
+ r'\n\s*(?:Retrieved from|Accessed via|Source database)\s*[::]',
1596
+ ]
1597
+
1598
+ # FIRST: Extract ONLY the synthesis section (after ✅ SYNTHESIS:)
1599
+ # More robust pattern that handles various formatting
1600
+ synthesis_patterns = [
1601
+ r'✅\s*\*{0,2}SYNTHESIS\*{0,2}\s*:?\s*\n+(.*)', # Standard format with newline
1602
+ r'\*\*✅\s*SYNTHESIS:\*\*\s*(.*)', # Bold format
1603
+ r'✅\s*SYNTHESIS:\s*(.*)', # Simple format
1604
+ r'SYNTHESIS:\s*(.*)', # Fallback without emoji
1605
+ ]
1606
+
1607
+ synthesis_text = None
1608
+ for pattern in synthesis_patterns:
1609
+ synthesis_match = re.search(pattern, last_response, re.IGNORECASE | re.DOTALL)
1610
+ if synthesis_match:
1611
+ synthesis_text = synthesis_match.group(1)
1612
+ logger.info(f"Extracted synthesis section using pattern: {pattern[:30]}...")
1613
+ break
1614
+
1615
+ if synthesis_text:
1616
+ logger.info("Extracted synthesis section for voice reading")
1617
+ else:
1618
+ # Fallback: if no synthesis marker found, use the whole response
1619
+ synthesis_text = last_response
1620
+ logger.info("No synthesis marker found, using full response")
1621
+
1622
+ # THEN: Remove references and footer sections
1623
+ for pattern in cutoff_patterns:
1624
+ match = re.search(pattern, synthesis_text, re.IGNORECASE | re.MULTILINE)
1625
+ if match:
1626
+ synthesis_text = synthesis_text[:match.start()]
1627
+ logger.info(f"Truncated response at pattern: {pattern[:50]}...")
1628
+ break
1629
+
1630
+ # Now clean the synthesis text
1631
+ clean_text = re.sub(r'\*\*(.*?)\*\*', r'\1', synthesis_text) # Remove bold
1632
+ clean_text = re.sub(r'\*(.*?)\*', r'\1', clean_text) # Remove italic
1633
+ clean_text = re.sub(r'#{1,6}\s*(.*?)\n', r'\1. ', clean_text) # Remove headers
1634
+ clean_text = re.sub(r'```.*?```', '', clean_text, flags=re.DOTALL) # Remove code blocks
1635
+ clean_text = re.sub(r'`(.*?)`', r'\1', clean_text) # Remove inline code
1636
+ clean_text = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', clean_text) # Remove links
1637
+ clean_text = re.sub(r'<[^>]+>', '', clean_text) # Remove HTML tags
1638
+ clean_text = re.sub(r'\n{3,}', '\n\n', clean_text) # Reduce multiple newlines
1639
+
1640
+ # Strip leading/trailing whitespace
1641
+ clean_text = clean_text.strip()
1642
+
1643
+ # Ensure we have something to read
1644
+ if not clean_text or len(clean_text) < 10:
1645
+ logger.warning("Synthesis text too short after cleaning, using original")
1646
+ clean_text = last_response[:2500] # Fallback to first 2500 chars
1647
+ # Check if ElevenLabs server is available
1648
+ try:
1649
+ server_tools = await mcp_manager.list_all_tools()
1650
+ elevenlabs_available = any('elevenlabs' in server_name for server_name in server_tools)
1651
+ if not elevenlabs_available:
1652
+ logger.error("ElevenLabs server not available in MCP tools")
1653
+ return gr.update(visible=True), gr.update(
1654
+ value=None,
1655
+ label="⚠️ Voice service not available - Please set ELEVENLABS_API_KEY"
1656
+ )
1657
+ except Exception as e:
1658
+ logger.error(f"Failed to check ElevenLabs availability: {e}", exc_info=True)
1659
+ return gr.update(visible=True), gr.update(
1660
+ value=None,
1661
+ label="⚠️ Voice service not available"
1662
+ )
1663
+
1664
+ # Remove phase markers from text
1665
+ clean_text = re.sub(r'\*\*[🎯🔧🤔✅].*?:\*\*', '', clean_text)
1666
+ # Call ElevenLabs text-to-speech through MCP
1667
+ logger.info(f"Calling ElevenLabs text-to-speech with {len(clean_text)} characters...")
1668
+ try:
1669
+ result = await call_mcp_tool(
1670
+ "elevenlabs__text_to_speech",
1671
+ {"text": clean_text, "speed": 0.95} # Slightly slower for clarity
1672
+ )
1673
+ except Exception as e:
1674
+ logger.error(f"MCP tool call failed: {e}", exc_info=True)
1675
+ raise
1676
+
1677
+ # Parse the result
1678
+ try:
1679
+ result_data = json.loads(result) if isinstance(result, str) else result
1680
+ # Check for API key error
1681
+ if "ELEVENLABS_API_KEY not configured" in str(result):
1682
+ logger.error("ElevenLabs API key not configured - found in result string")
1683
+ return gr.update(visible=True), gr.update(
1684
+ value=None,
1685
+ label="⚠️ Voice service unavailable - Please set ELEVENLABS_API_KEY environment variable"
1686
+ )
1687
+
1688
+ if result_data.get("status") == "success" and result_data.get("audio_base64"):
1689
+ # Save audio to temporary file
1690
+ with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as tmp_file:
1691
+ audio_data = base64.b64decode(result_data["audio_base64"])
1692
+ tmp_file.write(audio_data)
1693
+ audio_path = tmp_file.name
1694
+
1695
+ logger.info(f"Audio successfully generated and saved to: {audio_path}")
1696
+ return gr.update(visible=True), gr.update(
1697
+ value=audio_path,
1698
+ visible=True,
1699
+ label="🔊 Click to play voice output"
1700
+ )
1701
+ elif result_data.get("status") == "error":
1702
+ error_msg = result_data.get("message", "Unknown error")
1703
+ error_type = result_data.get("error", "Unknown")
1704
+ logger.error(f"ElevenLabs error - Type: {error_type}, Message: {error_msg}")
1705
+ return gr.update(visible=True), gr.update(
1706
+ value=None,
1707
+ label=f"⚠️ Voice service error: {error_msg}"
1708
+ )
1709
+ else:
1710
+ logger.error(f"Unexpected result structure")
1711
+ return gr.update(visible=True), gr.update(
1712
+ value=None,
1713
+ label="⚠️ Voice service returned no audio"
1714
+ )
1715
+ except json.JSONDecodeError as e:
1716
+ logger.error(f"JSON decode error: {e}")
1717
+ logger.error(f"Failed to parse ElevenLabs response, first 500 chars: {str(result)[:500]}")
1718
+ return gr.update(visible=True), gr.update(
1719
+ value=None,
1720
+ label="⚠️ Voice service response error"
1721
+ )
1722
+ except Exception as e:
1723
+ logger.error(f"Unexpected error in result parsing: {e}", exc_info=True)
1724
+ raise
1725
+
1726
+ except Exception as e:
1727
+ logger.error(f"Error in speak_last_response: {e}", exc_info=True)
1728
+ return gr.update(visible=True), gr.update(
1729
+ value=None,
1730
+ label=f"⚠️ Voice service error: {str(e)}"
1731
+ )
1732
+
1733
+ msg.submit(
1734
+ respond, [msg, chatbot], [chatbot],
1735
+ api_name="chat"
1736
+ ).then(
1737
+ update_speak_button, None, [speak_btn]
1738
+ ).then(
1739
+ lambda: "", None, [msg]
1740
+ )
1741
+
1742
+ # Add event handler for audio input
1743
+ audio_input.stop_recording(
1744
+ process_voice_input,
1745
+ inputs=[audio_input],
1746
+ outputs=[msg]
1747
+ ).then(
1748
+ lambda: None,
1749
+ outputs=[audio_input] # Clear audio after processing
1750
+ )
1751
+
1752
+ submit_btn.click(
1753
+ respond, [msg, chatbot], [chatbot],
1754
+ api_name="chat_button"
1755
+ ).then(
1756
+ update_speak_button, None, [speak_btn]
1757
+ ).then(
1758
+ lambda: "", None, [msg]
1759
+ )
1760
+
1761
+ retry_btn.click(
1762
+ retry_last, [chatbot], [chatbot],
1763
+ api_name="retry"
1764
+ ).then(
1765
+ update_speak_button, None, [speak_btn]
1766
+ )
1767
+
1768
+ undo_btn.click(
1769
+ undo_last, [chatbot], [chatbot],
1770
+ api_name="undo"
1771
+ )
1772
+
1773
+ clear_btn.click(
1774
+ lambda: None, None, chatbot,
1775
+ queue=False,
1776
+ api_name="clear"
1777
+ ).then(
1778
+ lambda: gr.update(interactive=False), None, [speak_btn]
1779
+ )
1780
+
1781
+ export_btn.click(
1782
+ export_conversation, chatbot, export_btn,
1783
+ api_name="export"
1784
+ )
1785
+
1786
+ speak_btn.click(
1787
+ speak_last_response, [chatbot], [audio_row, audio_output],
1788
+ api_name="speak"
1789
+ )
1790
+
1791
+ # Enable queue for streaming to work
1792
+ demo.queue()
1793
+
1794
+ try:
1795
+ # Use environment variable for port, default to 7860 for HuggingFace
1796
+ port = int(os.environ.get("GRADIO_SERVER_PORT", 7860))
1797
+ demo.launch(
1798
+ server_name="0.0.0.0",
1799
+ server_port=port,
1800
+ share=False
1801
+ )
1802
+ except KeyboardInterrupt:
1803
+ logger.info("Received keyboard interrupt, shutting down...")
1804
+ except Exception as e:
1805
+ logger.error(f"Error during launch: {e}", exc_info=True)
1806
+ finally:
1807
+ # Cleanup
1808
+ logger.info("Cleaning up resources...")
1809
+ await cleanup_mcp_servers()
1810
+
1811
+ if __name__ == "__main__":
1812
+ try:
1813
+ asyncio.run(main())
1814
+ except KeyboardInterrupt:
1815
+ logger.info("Application terminated by user")
1816
+ except Exception as e:
1817
+ logger.error(f"Application error: {e}", exc_info=True)
1818
+ raise
1819
+ finally:
1820
+ # Cancel cleanup task if running
1821
+ if cleanup_task and not cleanup_task.done():
1822
+ cleanup_task.cancel()
1823
+ logger.info("Cancelled memory cleanup task")
1824
+
1825
+ # Cleanup unified LLM client
1826
+ if client is not None:
1827
+ try:
1828
+ asyncio.run(client.cleanup())
1829
+ logger.info("LLM client cleanup completed")
1830
+ except Exception as e:
1831
+ logger.warning(f"LLM client cleanup error: {e}")
1832
+ pass
custom_mcp_client.py ADDED
@@ -0,0 +1,241 @@
1
+ """
2
+ Custom MCP client using direct subprocess communication.
3
+ This bypasses the buggy stdio_client from mcp.client.stdio.
4
+ """
5
+
6
+ import asyncio
7
+ import json
8
+ import logging
9
+ import subprocess
10
+ import sys
11
+ from pathlib import Path
12
+ from typing import Any, Dict, List, Optional
13
+
14
+ logger = logging.getLogger(__name__)
15
+
16
+
17
+ class MCPClient:
18
+ """Custom MCP client using direct subprocess communication"""
19
+
20
+ def __init__(self, server_script: str, server_name: str):
21
+ self.server_script = server_script
22
+ self.server_name = server_name
23
+ self.process: Optional[subprocess.Popen] = None
24
+ self.message_id = 0
25
+ self._initialized = False
26
+ self.script_path = server_script # Store for potential restart
27
+
28
+ async def start(self):
29
+ """Start the MCP server subprocess"""
30
+ logger.info(f"Starting MCP server: {self.server_name}")
31
+
32
+ self.process = subprocess.Popen(
33
+ [sys.executable, self.server_script],
34
+ stdin=subprocess.PIPE,
35
+ stdout=subprocess.PIPE,
36
+ stderr=subprocess.PIPE,
37
+ text=True,
38
+ bufsize=1 # Line-buffered I/O to prevent 8KB truncation
39
+ )
40
+
41
+ # Initialize the session
42
+ await self._initialize()
43
+ logger.info(f"Successfully started MCP server: {self.server_name}")
44
+
45
+ async def _initialize(self):
46
+ """Initialize the MCP session"""
47
+ init_message = {
48
+ "jsonrpc": "2.0",
49
+ "id": self._next_id(),
50
+ "method": "initialize",
51
+ "params": {
52
+ "protocolVersion": "2024-11-05",
53
+ "capabilities": {},
54
+ "clientInfo": {
55
+ "name": "als-research-agent",
56
+ "version": "1.0.0"
57
+ }
58
+ }
59
+ }
60
+
61
+ response = await self._send_request(init_message)
62
+ if "result" in response:
63
+ self._initialized = True
64
+ logger.info(f"Initialized {self.server_name}: {response['result'].get('serverInfo', {})}")
65
+ else:
66
+ raise Exception(f"Initialization failed: {response}")
67
+
68
+ def _next_id(self) -> int:
69
+ """Get next message ID"""
70
+ self.message_id += 1
71
+ return self.message_id
72
+
73
+ async def _send_request(self, message: Dict[str, Any]) -> Dict[str, Any]:
74
+ """Send a JSON-RPC request and wait for response"""
75
+ if not self.process:
76
+ raise RuntimeError("Server not started")
77
+
78
+ # Check if process is still alive
79
+ if self.process.poll() is not None:
80
+ # Process has terminated
81
+ raise RuntimeError(f"Server {self.server_name} has terminated unexpectedly")
82
+
83
+ # Send request
84
+ request_json = json.dumps(message) + "\n"
85
+ self.process.stdin.write(request_json)
86
+ self.process.stdin.flush()
87
+
88
+ # Read response with timeout
89
+ try:
90
+ response_line = await asyncio.wait_for(
91
+ asyncio.to_thread(self.process.stdout.readline),
92
+ timeout=60.0 # Extended timeout for LlamaIndex/RAG server initialization
93
+ )
94
+
95
+ if not response_line:
96
+ raise Exception("Server closed stdout")
97
+
98
+ return json.loads(response_line)
99
+ except asyncio.TimeoutError:
100
+ raise Exception("Request timed out")
101
+
102
+ async def list_tools(self) -> List[Dict[str, Any]]:
103
+ """List available tools"""
104
+ if not self._initialized:
105
+ raise RuntimeError("Client not initialized")
106
+
107
+ message = {
108
+ "jsonrpc": "2.0",
109
+ "id": self._next_id(),
110
+ "method": "tools/list",
111
+ "params": {}
112
+ }
113
+
114
+ response = await self._send_request(message)
115
+ if "result" in response:
116
+ return response["result"].get("tools", [])
117
+ else:
118
+ raise Exception(f"List tools failed: {response}")
119
+
120
+ async def call_tool(self, tool_name: str, arguments: Dict[str, Any]) -> str:
121
+ """Call a tool"""
122
+ if not self._initialized:
123
+ raise RuntimeError("Client not initialized")
124
+
125
+ message = {
126
+ "jsonrpc": "2.0",
127
+ "id": self._next_id(),
128
+ "method": "tools/call",
129
+ "params": {
130
+ "name": tool_name,
131
+ "arguments": arguments
132
+ }
133
+ }
134
+
135
+ response = await self._send_request(message)
136
+ if "result" in response:
137
+ # Extract result from response
138
+ result = response["result"]
139
+
140
+ # Handle different response formats
141
+ if isinstance(result, dict):
142
+ # New format with 'result' field
143
+ if "result" in result:
144
+ return result["result"]
145
+ # Content array format
146
+ elif "content" in result:
147
+ content = result["content"]
148
+ if isinstance(content, list) and len(content) > 0:
149
+ return content[0].get("text", str(content))
150
+ return str(content)
151
+ else:
152
+ return str(result)
153
+ else:
154
+ return str(result)
155
+ else:
156
+ error = response.get("error", {})
157
+ raise Exception(f"Tool call failed: {error.get('message', response)}")
158
+
159
+ async def close(self):
160
+ """Close the MCP client and terminate server"""
161
+ if self.process:
162
+ logger.info(f"Closing MCP server: {self.server_name}")
163
+ self.process.terminate()
164
+ try:
165
+ self.process.wait(timeout=5)
166
+ except subprocess.TimeoutExpired:
167
+ self.process.kill()
168
+ self.process.wait()
169
+ self.process = None
170
+ self._initialized = False
171
+
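A minimal usage sketch for this client; the server script path and tool name below are placeholders, not paths from this repo:

```python
# Hypothetical usage of MCPClient - path and tool name are illustrative
import asyncio

async def demo_client():
    client = MCPClient("mcp_servers/pubmed_server.py", "pubmed")  # illustrative path
    await client.start()
    try:
        tools = await client.list_tools()
        print([t["name"] for t in tools])
        print(await client.call_tool("search_pubmed", {"query": "ALS biomarkers"}))  # illustrative tool
    finally:
        await client.close()

asyncio.run(demo_client())
```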
172
+
173
+ class MCPClientManager:
174
+ """Manage multiple MCP clients"""
175
+
176
+ def __init__(self):
177
+ self.clients: Dict[str, MCPClient] = {}
178
+
179
+ async def add_server(self, name: str, script_path: str):
180
+ """Add and start an MCP server"""
181
+ client = MCPClient(script_path, name)
182
+ await client.start()
183
+ self.clients[name] = client
184
+ logger.info(f"Added MCP server: {name}")
185
+
186
+ async def call_tool(self, server_name: str, tool_name: str, arguments: Dict[str, Any]) -> str:
187
+ """Call a tool on a specific server"""
188
+ if server_name not in self.clients:
189
+ raise ValueError(f"Server not found: {server_name}")
190
+
191
+ return await self.clients[server_name].call_tool(tool_name, arguments)
192
+
193
+ async def list_all_tools(self) -> Dict[str, List[Dict[str, Any]]]:
194
+ """List tools from all servers, handling failures gracefully"""
195
+ all_tools = {}
196
+ failed_servers = []
197
+
198
+ for name, client in self.clients.items():
199
+ try:
200
+ tools = await client.list_tools()
201
+ for tool in tools:
202
+ tool['server'] = name # Add server info to each tool
203
+ all_tools[name] = tools
204
+ except Exception as e:
205
+ logger.error(f"Failed to list tools from server {name}: {e}")
206
+ failed_servers.append(name)
207
+ # Continue with other servers instead of failing entirely
208
+ all_tools[name] = []
209
+
210
+ if failed_servers:
211
+ logger.warning(f"Some servers failed to respond: {', '.join(failed_servers)}")
212
+ # Try to restart failed servers
213
+ for server_name in failed_servers:
214
+ try:
215
+ client = self.clients[server_name]
216
+ script_path = client.script_path if hasattr(client, 'script_path') else None
217
+ if script_path:
218
+ logger.info(f"Attempting to restart {server_name} server...")
219
+ await client.close()
220
+ # Re-add the server (which will restart it)
221
+ await self.add_server(server_name, script_path)
222
+ # Try listing tools again after restart
223
+ tools = await self.clients[server_name].list_tools()
224
+ for tool in tools:
225
+ tool['server'] = server_name
226
+ all_tools[server_name] = tools
227
+ logger.info(f"Successfully restarted {server_name} server")
228
+ except Exception as restart_error:
229
+ logger.error(f"Failed to restart {server_name}: {restart_error}")
230
+ # Remove the failed server from clients to prevent further errors
231
+ if server_name in self.clients:
232
+ del self.clients[server_name]
233
+
234
+ return all_tools
235
+
236
+ async def close_all(self):
237
+ """Close all MCP clients"""
238
+ for client in self.clients.values():
239
+ await client.close()
240
+ self.clients.clear()
241
+ logger.info("All MCP servers closed")
llm_client.py ADDED
@@ -0,0 +1,439 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Unified LLM Client - Single interface for all LLM providers
4
+ Handles Anthropic, SambaNova, and automatic fallback logic internally
5
+ """
6
+
7
+ import os
8
+ import logging
9
+ import asyncio
10
+ import httpx
11
+ from typing import AsyncGenerator, List, Dict, Any, Optional, Tuple
12
+ from anthropic import AsyncAnthropic
13
+ from dotenv import load_dotenv
14
+
15
+ load_dotenv()
16
+ logger = logging.getLogger(__name__)
17
+
18
+
19
+ class UnifiedLLMClient:
20
+ """
21
+ Unified client that abstracts all LLM provider logic.
22
+ Provides a single, clean interface to the application.
23
+ """
24
+
25
+ def __init__(self):
26
+ """Initialize the unified client with automatic provider selection"""
27
+ self.primary_client = None
28
+ self.fallback_router = None
29
+ self.provider_name = None
30
+ self.config = self._load_configuration()
31
+ self._initialize_providers()
32
+
33
+ def _load_configuration(self) -> Dict[str, Any]:
34
+ """Load configuration from environment variables"""
35
+ return {
36
+ "anthropic_api_key": os.getenv("ANTHROPIC_API_KEY"),
37
+ "use_fallback": os.getenv("USE_FALLBACK_LLM", "false").lower() == "true",
38
+ "provider_preference": os.getenv("LLM_PROVIDER_PREFERENCE", "auto"),
39
+ "default_model": os.getenv("ANTHROPIC_MODEL", "claude-sonnet-4-5-20250929"),
40
+ "max_retries": int(os.getenv("LLM_MAX_RETRIES", "2")),
41
+ "is_hf_space": os.getenv("SPACE_ID") is not None,
42
+ "enable_smart_routing": os.getenv("ENABLE_SMART_ROUTING", "false").lower() == "true"
43
+ }
44
+
45
+ def _initialize_providers(self):
46
+ """Initialize LLM providers based on configuration"""
47
+
48
+ # Try to initialize Anthropic first
49
+ if self.config["anthropic_api_key"]:
50
+ try:
51
+ self.primary_client = AsyncAnthropic(api_key=self.config["anthropic_api_key"])
52
+ self.provider_name = "Anthropic Claude"
53
+ logger.info("Anthropic client initialized successfully")
54
+ except Exception as e:
55
+ logger.warning(f"Failed to initialize Anthropic client: {e}")
56
+ self.primary_client = None
57
+
58
+ # Initialize fallback if needed
59
+ if self.config["use_fallback"] or not self.primary_client:
60
+ try:
61
+ from llm_providers import llm_router
62
+ self.fallback_router = llm_router
63
+
64
+ if not self.primary_client:
65
+ self.provider_name = "SambaNova Llama 3.3 70B"
66
+ logger.info("Using SambaNova as primary provider")
67
+ else:
68
+ logger.info("SambaNova fallback configured for automatic failover")
69
+
70
+ except ImportError:
71
+ logger.warning("Fallback LLM provider not available")
72
+
73
+ if not self.primary_client:
74
+ self._raise_configuration_error()
75
+
76
+ def _raise_configuration_error(self):
77
+ """Raise appropriate error for missing configuration"""
78
+ if self.config["is_hf_space"]:
79
+ raise ValueError(
80
+ "🚨 No LLM provider configured!\n\n"
81
+ "Option 1: Add your Anthropic API key as a Space secret:\n"
82
+ "1. Go to your Space Settings\n"
83
+ "2. Add secret: ANTHROPIC_API_KEY = your_key\n\n"
84
+ "Option 2: Enable free SambaNova fallback:\n"
85
+ "Add secret: USE_FALLBACK_LLM = true"
86
+ )
87
+ else:
88
+ raise ValueError(
89
+ "No LLM provider configured.\n\n"
90
+ "Option 1: Add to .env file:\n"
91
+ "ANTHROPIC_API_KEY=your_api_key_here\n\n"
92
+ "Option 2: Enable free SambaNova:\n"
93
+ "USE_FALLBACK_LLM=true"
94
+ )
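From the application side, usage boils down to constructing the client once and iterating the stream; a hedged sketch consistent with the `stream()` signature defined below:

```python
# Hypothetical caller-side usage of UnifiedLLMClient
import asyncio

async def demo_stream():
    llm = UnifiedLLMClient()  # raises ValueError if no provider is configured
    messages = [{"role": "user", "content": "Summarize recent ALS biomarker research."}]
    async for text, tool_calls, provider in llm.stream(messages=messages, max_tokens=512):
        print(f"[{provider}] {text[-80:]}")

asyncio.run(demo_stream())
```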
95
+
96
+ async def stream(
97
+ self,
98
+ messages: List[Dict],
99
+ tools: List[Dict] = None,
100
+ system_prompt: str = None,
101
+ model: str = None,
102
+ max_tokens: int = 8192,
103
+ temperature: float = 0.7
104
+ ) -> AsyncGenerator[Tuple[str, List[Dict], str], None]:
105
+ """
106
+ Stream responses from the LLM with automatic fallback.
107
+
108
+ This is the main interface - it handles all provider selection,
109
+ retries, and fallback logic internally.
110
+
111
+ Yields: (response_text, tool_calls, provider_used)
112
+ """
113
+
114
+ # Use default model if not specified
115
+ if model is None:
116
+ model = self.config["default_model"]
117
+
118
+ # Track which provider we're using
119
+ provider_used = self.provider_name
120
+
121
+ # Determine provider order based on preference
122
+ use_anthropic_first = True
123
+ if self.config["provider_preference"] == "cost_optimize" and self.fallback_router:
124
+ # With cost_optimize, prefer SambaNova first
125
+ use_anthropic_first = False
126
+
127
+ # Apply smart routing if enabled
128
+ if self.config.get("enable_smart_routing", False) and self.primary_client and self.fallback_router:
129
+ # Extract the last user message for analysis
130
+ last_message = ""
131
+ for msg in reversed(messages):
132
+ if msg.get("role") == "user":
133
+ if isinstance(msg.get("content"), str):
134
+ last_message = msg["content"]
135
+ elif isinstance(msg.get("content"), list):
136
+ # Extract text from content blocks
137
+ for block in msg["content"]:
138
+ if isinstance(block, dict) and block.get("type") == "text":
139
+ last_message = block.get("text", "")
140
+ break
141
+ break
142
+
143
+ if last_message:
144
+ # Classify the query
145
+ query_type = self.classify_query_complexity(
146
+ last_message,
147
+ len(tools) if tools else 0
148
+ )
149
+
150
+ # Override provider preference based on classification
151
+ if query_type == "simple":
152
+ if use_anthropic_first:
153
+ logger.info(f"Smart routing: Directing simple query to Llama for cost savings: '{last_message[:80]}...'")
154
+ use_anthropic_first = False
155
+ elif query_type == "complex":
156
+ if not use_anthropic_first:
157
+ logger.info(f"Smart routing: Directing complex query to Claude for better quality: '{last_message[:80]}...'")
158
+ use_anthropic_first = True
159
+
160
+ # Try first provider based on preference
161
+ if use_anthropic_first and self.primary_client:
162
+ try:
163
+ async for result in self._stream_anthropic(
164
+ messages, tools, system_prompt, model, max_tokens, temperature
165
+ ):
166
+ yield result
167
+ return # Success, exit
168
+ except Exception as e:
169
+ logger.warning(f"Primary provider failed: {e}")
170
+
171
+ # Fall through to fallback if available
172
+ if not self.fallback_router:
173
+ raise
174
+
175
+ # Try fallback provider
176
+ if self.fallback_router:
177
+             if not use_anthropic_first:
+                 logger.info("Using SambaNova as primary provider (cost_optimize mode)")
+             elif not self.primary_client:
+                 logger.info("Using fallback LLM provider")
179
+
180
+ try:
181
+ # Override provider preference to force SambaNova when smart routing decided to use it
182
+ effective_preference = "cost_optimize" if not use_anthropic_first else self.config["provider_preference"]
183
+
184
+ async for text, tool_calls, provider in self.fallback_router.stream_with_fallback(
185
+ messages=messages,
186
+ tools=tools or [],
187
+ system_prompt=system_prompt,
188
+ model=model,
189
+ max_tokens=max_tokens,
190
+ provider_preference=effective_preference
191
+ ):
192
+ yield (text, tool_calls, provider)
193
+
194
+ # If we used SambaNova first successfully with cost_optimize, we're done
195
+ if not use_anthropic_first:
196
+ return
197
+
198
+ except Exception as e:
199
+ if not use_anthropic_first and self.primary_client:
200
+ # SambaNova failed in cost_optimize mode, try Anthropic
201
+ logger.warning(f"SambaNova failed in cost_optimize mode: {e}, falling back to Anthropic")
202
+ try:
203
+ async for result in self._stream_anthropic(
204
+ messages, tools, system_prompt, model, max_tokens, temperature
205
+ ):
206
+ yield result
207
+ return # Success, exit
208
+ except Exception as anthropic_error:
209
+ logger.error(f"All LLM providers failed: SambaNova: {e}, Anthropic: {anthropic_error}")
210
+ raise RuntimeError("All LLM providers failed. Please check configuration.")
211
+ else:
212
+ logger.error(f"All LLM providers failed: {e}")
213
+ raise RuntimeError("All LLM providers failed. Please check configuration.")
214
+ else:
215
+ raise RuntimeError("No LLM providers available")
216
+
217
+ async def _stream_anthropic(
218
+ self,
219
+ messages: List[Dict],
220
+ tools: List[Dict],
221
+ system_prompt: str,
222
+ model: str,
223
+ max_tokens: int,
224
+ temperature: float
225
+ ) -> AsyncGenerator[Tuple[str, List[Dict], str], None]:
226
+ """Stream from Anthropic with retry logic"""
227
+
228
+ retry_delay = 1
229
+ last_error = None
230
+
231
+ # Skip system message if it's in messages array
232
+ api_messages = messages[1:] if messages and messages[0].get("role") == "system" else messages
233
+
234
+ # Use system prompt or extract from messages
235
+ if not system_prompt and messages and messages[0].get("role") == "system":
236
+ system_prompt = messages[0].get("content", "")
237
+
238
+ for attempt in range(self.config["max_retries"] + 1):
239
+ try:
240
+ logger.info(f"Streaming from Anthropic (attempt {attempt + 1})")
241
+
242
+ accumulated_text = ""
243
+ tool_calls = []
244
+
245
+ # Create the stream
246
+ stream_params = {
247
+ "model": model,
248
+ "max_tokens": max_tokens,
249
+ "messages": api_messages,
250
+ "temperature": temperature
251
+ }
252
+
253
+ if system_prompt:
254
+ stream_params["system"] = system_prompt
255
+
256
+ if tools:
257
+ stream_params["tools"] = tools
258
+
259
+ async with self.primary_client.messages.stream(**stream_params) as stream:
260
+ async for event in stream:
261
+ if event.type == "content_block_start":
262
+ if event.content_block.type == "tool_use":
263
+ tool_calls.append({
264
+ "id": event.content_block.id,
265
+ "name": event.content_block.name,
266
+ "input": {}
267
+ })
268
+
269
+ elif event.type == "content_block_delta":
270
+ if event.delta.type == "text_delta":
271
+ accumulated_text += event.delta.text
272
+ yield (accumulated_text, tool_calls, "Anthropic Claude")
273
+
274
+ # Get final message
275
+ final_message = await stream.get_final_message()
276
+
277
+ # Rebuild tool calls from final message
278
+ tool_calls.clear()
279
+ for block in final_message.content:
280
+ if block.type == "tool_use":
281
+ tool_calls.append({
282
+ "id": block.id,
283
+ "name": block.name,
284
+ "input": block.input
285
+ })
286
+ elif block.type == "text" and block.text:
287
+ if block.text not in accumulated_text:
288
+ accumulated_text += block.text
289
+
290
+ yield (accumulated_text, tool_calls, "Anthropic Claude")
291
+ return # Success
292
+
293
+ except (httpx.RemoteProtocolError, httpx.ReadError) as e:
294
+ last_error = e
295
+ logger.warning(f"Network error on attempt {attempt + 1}: {e}")
296
+
297
+ if attempt < self.config["max_retries"]:
298
+ await asyncio.sleep(retry_delay)
299
+ retry_delay *= 2
300
+ else:
301
+ raise
302
+
303
+ except Exception as e:
304
+ logger.error(f"Anthropic streaming error: {e}")
305
+ raise
306
+
307
+ def get_status(self) -> Dict[str, Any]:
308
+ """Get current client status and configuration"""
309
+ return {
310
+ "primary_provider": "Anthropic" if self.primary_client else None,
311
+ "fallback_enabled": bool(self.fallback_router),
312
+ "current_provider": self.provider_name,
313
+ "provider_preference": self.config["provider_preference"],
314
+ "max_retries": self.config["max_retries"]
315
+ }
316
+
317
+ def is_using_llama_primary(self) -> bool:
318
+ """Check if Llama/SambaNova is the primary provider"""
319
+ # Check if cost_optimize preference is set and fallback is available
320
+ if self.config.get("provider_preference") == "cost_optimize" and self.fallback_router:
321
+ return True
322
+ # Check if we have no Anthropic client and are using SambaNova
323
+ if not self.primary_client and self.fallback_router:
324
+ return True
325
+ return False
326
+
327
+ def classify_query_complexity(self, message: str, tools_count: int = 0) -> str:
328
+ """
329
+ Classify query as 'simple' or 'complex' based on content analysis.
330
+
331
+ Args:
332
+ message: The user's query text
333
+ tools_count: Number of tools available for this query
334
+
335
+ Returns:
336
+ 'simple' | 'complex' - The query classification
337
+ """
338
+ message_lower = message.lower()
339
+
340
+ # Simple query indicators (good for Llama)
341
+ simple_patterns = [
342
+ "what is", "define", "when was", "who is", "list of",
343
+ "how many", "name the", "what does", "explain what",
344
+ "is there", "are there", "can you list", "tell me about",
345
+ "what are the symptoms", "side effects of", "list the",
346
+ "symptoms of", "treatment for", "causes of"
347
+ ]
348
+
349
+ # Complex query indicators (better for Claude)
350
+ complex_patterns = [
351
+ "analyze", "compare", "evaluate", "synthesize", "comprehensive",
352
+ "all", "every", "detailed", "mechanism", "pathophysiology",
353
+ "genotyping", "gene therapy", "combination therapy",
354
+ "latest research", "recent studies", "cutting-edge",
355
+ "molecular", "genetic mutation", "therapeutic pipeline",
356
+ "clinical trial results", "meta-analysis", "systematic review",
357
+ # Enhanced trial-related patterns
358
+ "trials", "clinical trials", "studies", "clinical study",
359
+ "NCT", "recruiting", "enrollment", "study protocol",
360
+ "phase 1", "phase 2", "phase 3", "phase 4", "early phase",
361
+ "investigational", "experimental", "novel treatment",
362
+ "treatment pipeline", "research pipeline", "drug development"
363
+ ]
364
+
365
+ # Count pattern matches
366
+ simple_score = sum(1 for pattern in simple_patterns if pattern in message_lower)
367
+ complex_score = sum(1 for pattern in complex_patterns if pattern in message_lower)
368
+
369
+ # Decision logic
370
+ if complex_score > 0:
371
+ # Any complex indicator suggests complex query
372
+ return "complex"
373
+ elif simple_score > 0 and len(message) < 150:
374
+ # Simple pattern and short query
375
+ return "simple"
376
+ elif len(message) > 300:
377
+ # Long queries are likely complex
378
+ return "complex"
379
+ elif tools_count > 8:
380
+ # Many tools suggest complex analysis needed
381
+ return "complex"
382
+ else:
383
+ # Default to complex for safety (better quality)
384
+ return "complex" if self.primary_client else "simple"
385
+
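+     # Illustrative results from classify_query_complexity (per the patterns above):
+     #     "what is riluzole"                  -> "simple"
+     #     "compare SOD1 gene therapy trials"  -> "complex"
+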
386
+ def get_provider_display_name(self) -> str:
387
+ """Get a user-friendly provider status string"""
388
+ if self.primary_client and self.fallback_router:
389
+ # Both providers available
390
+ if self.config["provider_preference"] == "cost_optimize":
391
+ status = "SambaNova Llama 3.3 70B (primary, cost-optimized) with Anthropic Claude fallback"
392
+ elif self.config["provider_preference"] == "quality_first":
393
+ status = "Anthropic Claude (primary, quality-first) with SambaNova fallback"
394
+ else: # auto
395
+ status = "Anthropic Claude (with SambaNova fallback)"
396
+ elif self.primary_client:
397
+ status = "Anthropic Claude"
398
+ elif self.fallback_router:
399
+ status = f"SambaNova Llama 3.3 70B ({self.config['provider_preference']} mode)"
400
+ else:
401
+ status = "Not configured"
402
+
403
+ return status
404
+
405
+ async def cleanup(self):
406
+ """Clean up resources"""
407
+ if self.fallback_router:
408
+ try:
409
+ await self.fallback_router.cleanup()
410
+             except Exception:
+                 # Best-effort cleanup; never let shutdown errors propagate
+                 pass
412
+
413
+ async def __aenter__(self):
414
+ """Async context manager entry"""
415
+ return self
416
+
417
+ async def __aexit__(self, exc_type, exc_val, exc_tb):
418
+ """Async context manager exit"""
419
+ await self.cleanup()
420
+
421
+
422
+ # Global instance (optional - can be created per request instead)
423
+ _global_client: Optional[UnifiedLLMClient] = None
424
+
425
+
426
+ def get_llm_client() -> UnifiedLLMClient:
427
+ """Get or create the global LLM client instance"""
428
+ global _global_client
429
+ if _global_client is None:
430
+ _global_client = UnifiedLLMClient()
431
+ return _global_client
432
+
433
+
434
+ async def cleanup_global_client():
435
+ """Clean up the global client instance"""
436
+ global _global_client
437
+ if _global_client:
438
+ await _global_client.cleanup()
439
+ _global_client = None
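+
+
+ # Minimal usage sketch (illustrative, not the app's entry point): the unified
+ # client hides all provider selection behind one async generator.
+ #
+ #     async def demo():
+ #         client = get_llm_client()
+ #         messages = [{"role": "user", "content": "Summarize recent ALS trial news"}]
+ #         async for text, tool_calls, provider in client.stream(messages):
+ #             print(f"[{provider}] {len(text)} chars so far")
+ #         await cleanup_global_client()
+ #
+ #     asyncio.run(demo())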
llm_providers.py ADDED
@@ -0,0 +1,357 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Multi-LLM provider support with fallback logic.
4
+ Includes SambaNova free tier as primary fallback option.
5
+ """
6
+
7
+ import os
8
+ import json
9
+ import logging
10
+ import httpx
11
+ import asyncio
12
+ from typing import AsyncGenerator, List, Dict, Any, Optional, Tuple
13
+ from anthropic import AsyncAnthropic
14
+ from dotenv import load_dotenv
15
+
16
+ load_dotenv()
17
+
18
+ logger = logging.getLogger(__name__)
19
+
20
+
21
+ class SambaNovaProvider:
22
+ """
23
+ SambaNova Cloud provider - requires API key for access.
24
+ Get your API key at https://cloud.sambanova.ai/
25
+ Includes $5-30 free credits for new accounts.
26
+ """
27
+
28
+ BASE_URL = "https://api.sambanova.ai/v1"
29
+
30
+ # Available models
31
+ MODELS = {
32
+ "llama-3.3-70b": "Meta-Llama-3.3-70B-Instruct", # Latest and best!
33
+ "llama-3.1-405b": "Meta-Llama-3.1-405B-Instruct",
34
+ "llama-3.1-70b": "Meta-Llama-3.1-70B-Instruct",
35
+ "llama-3.1-8b": "Meta-Llama-3.1-8B-Instruct",
36
+ "llama-3.2-11b": "Llama-3.2-11B-Vision-Instruct",
37
+ "llama-3.2-3b": "Llama-3.2-3B-Instruct",
38
+ "llama-3.2-1b": "Llama-3.2-1B-Instruct"
39
+ }
40
+
41
+ def __init__(self, api_key: Optional[str] = None):
42
+ """
43
+ Initialize SambaNova provider.
44
+ API key is REQUIRED - get yours at https://cloud.sambanova.ai/
45
+ """
46
+ self.api_key = api_key or os.getenv("SAMBANOVA_API_KEY")
47
+ if not self.api_key:
48
+ raise ValueError(
49
+ "SAMBANOVA_API_KEY is required for SambaNova API access.\n"
50
+ "Get your API key at: https://cloud.sambanova.ai/\n"
51
+ "Then set it in your .env file: SAMBANOVA_API_KEY=your_key_here"
52
+ )
53
+ self.client = httpx.AsyncClient(timeout=60.0)
54
+
55
+ async def stream(
56
+ self,
57
+ messages: List[Dict],
58
+ system: str = None,
59
+ tools: List[Dict] = None,
60
+ model: str = "llama-3.1-70b",
61
+ max_tokens: int = 4096,
62
+ temperature: float = 0.7
63
+ ) -> AsyncGenerator[Tuple[str, List[Dict]], None]:
64
+ """
65
+ Stream responses from SambaNova API.
66
+ Compatible interface with Anthropic streaming.
67
+ """
68
+
69
+ # Select the full model name
70
+ full_model = self.MODELS.get(model, self.MODELS["llama-3.1-70b"])
71
+
72
+ # Convert messages to OpenAI format (SambaNova uses OpenAI-compatible API)
73
+ formatted_messages = []
74
+
75
+ # Add system message if provided
76
+ if system:
77
+ formatted_messages.append({
78
+ "role": "system",
79
+ "content": system
80
+ })
81
+
82
+ # Convert Anthropic message format to OpenAI format
83
+ for msg in messages:
84
+ if msg["role"] == "user":
85
+ formatted_messages.append({
86
+ "role": "user",
87
+ "content": msg.get("content", "")
88
+ })
89
+ elif msg["role"] == "assistant":
90
+ # Handle assistant messages with potential tool calls
91
+ content = msg.get("content", "")
92
+ if isinstance(content, list):
93
+ # Extract text from content blocks
94
+ text_parts = []
95
+ for block in content:
96
+ if block.get("type") == "text":
97
+ text_parts.append(block.get("text", ""))
98
+ content = "\n".join(text_parts)
99
+
100
+ formatted_messages.append({
101
+ "role": "assistant",
102
+ "content": content
103
+ })
104
+
105
+ # Prepare request payload
106
+ payload = {
107
+ "model": full_model,
108
+ "messages": formatted_messages,
109
+ "max_tokens": max_tokens,
110
+ "temperature": temperature,
111
+ "stream": True,
112
+ "stream_options": {"include_usage": True}
113
+ }
114
+
115
+ # Add tools if provided (for models that support it)
116
+ if tools and model in ["llama-3.3-70b", "llama-3.1-405b", "llama-3.1-70b"]:
117
+ # Convert Anthropic tool format to OpenAI format
118
+ openai_tools = []
119
+ for tool in tools:
120
+ openai_tools.append({
121
+ "type": "function",
122
+ "function": {
123
+ "name": tool["name"],
124
+ "description": tool.get("description", ""),
125
+ "parameters": tool.get("input_schema", {})
126
+ }
127
+ })
128
+ payload["tools"] = openai_tools
129
+ payload["tool_choice"] = "auto"
130
+
131
+ # Headers - API key is always required now
132
+ headers = {
133
+ "Content-Type": "application/json",
134
+ "Authorization": f"Bearer {self.api_key}"
135
+ }
136
+
137
+ try:
138
+ # Make streaming request
139
+ accumulated_text = ""
140
+ tool_calls = []
141
+
142
+ async with self.client.stream(
143
+ "POST",
144
+ f"{self.BASE_URL}/chat/completions",
145
+ json=payload,
146
+ headers=headers
147
+ ) as response:
148
+ response.raise_for_status()
149
+
150
+ async for line in response.aiter_lines():
151
+ if line.startswith("data: "):
152
+ data = line[6:] # Remove "data: " prefix
153
+
154
+ if data == "[DONE]":
155
+ break
156
+
157
+ try:
158
+ chunk = json.loads(data)
159
+
160
+ # Handle usage-only chunks (sent at end of stream)
161
+ if "usage" in chunk and ("choices" not in chunk or len(chunk.get("choices", [])) == 0):
162
+ # This is a usage statistics chunk, skip it
163
+ logger.debug(f"Received usage chunk: {chunk.get('usage', {})}")
164
+ continue
165
+
166
+ # Extract content from chunk
167
+ if "choices" in chunk and len(chunk["choices"]) > 0:
168
+ choice = chunk["choices"][0]
169
+ delta = choice.get("delta", {})
170
+
171
+ # Handle text content
172
+ if "content" in delta and delta["content"]:
173
+ accumulated_text += delta["content"]
174
+ yield (accumulated_text, tool_calls)
175
+
176
+ # Handle tool calls (if supported)
177
+ if "tool_calls" in delta:
178
+ for tc in delta["tool_calls"]:
179
+ # Convert OpenAI tool call format to Anthropic format
180
+ tool_calls.append({
181
+ "id": tc.get("id", f"tool_{len(tool_calls)}"),
182
+ "name": tc.get("function", {}).get("name", ""),
183
+ "input": json.loads(tc.get("function", {}).get("arguments", "{}"))
184
+ })
185
+
186
+ except json.JSONDecodeError:
187
+ logger.warning(f"Failed to parse SSE data: {data}")
188
+ continue
189
+
190
+ # Final yield with complete results
191
+ yield (accumulated_text, tool_calls)
192
+
193
+ except httpx.HTTPStatusError as e:
194
+ if e.response.status_code == 410:
195
+ logger.error("SambaNova API endpoint has been discontinued (410 GONE)")
196
+ raise RuntimeError(
197
+ "SambaNova API endpoint no longer exists. "
198
+ "Make sure you have a valid API key set in SAMBANOVA_API_KEY."
199
+ )
200
+ elif e.response.status_code == 401:
201
+ logger.error("SambaNova API authentication failed")
202
+ raise RuntimeError(
203
+ "SambaNova authentication failed. Please check your API key."
204
+ )
205
+ else:
206
+ logger.error(f"SambaNova API error: {e}")
207
+ raise
208
+ except httpx.HTTPError as e:
209
+ logger.error(f"SambaNova API error: {e}")
210
+ raise
211
+
212
+ async def close(self):
213
+ """Close the HTTP client"""
214
+ await self.client.aclose()
215
+
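+ # Usage sketch (illustrative): the provider mirrors the Anthropic streaming shape.
+ #
+ #     provider = SambaNovaProvider()  # reads SAMBANOVA_API_KEY from the environment
+ #     async for text, tool_calls in provider.stream(
+ #         messages=[{"role": "user", "content": "Hi"}],
+ #         model="llama-3.3-70b",
+ #     ):
+ #         ...
+ #     await provider.close()
+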
216
+
217
+ class LLMRouter:
218
+ """
219
+ Routes LLM requests to appropriate providers with fallback logic.
220
+ """
221
+
222
+ def __init__(self):
223
+ self.providers = {}
224
+ self._setup_providers()
225
+
226
+ def _setup_providers(self):
227
+ """Initialize available providers"""
228
+
229
+ # Primary: Anthropic (if API key available)
230
+ anthropic_key = os.getenv("ANTHROPIC_API_KEY")
231
+ if anthropic_key:
232
+ self.providers["anthropic"] = AsyncAnthropic(api_key=anthropic_key)
233
+ logger.info("Anthropic provider initialized")
234
+
235
+         # Fallback: SambaNova (requires SAMBANOVA_API_KEY; skipped if not configured,
+         # so importing this module never fails just because the key is missing)
+         try:
+             self.providers["sambanova"] = SambaNovaProvider()
+             logger.info("SambaNova provider initialized")
+         except ValueError as e:
+             logger.warning(f"SambaNova provider unavailable: {e}")
238
+
239
+ async def stream_with_fallback(
240
+ self,
241
+ messages: List[Dict],
242
+ tools: List[Dict],
243
+ system_prompt: str,
244
+ model: str = None,
245
+ max_tokens: int = 4096,
246
+ provider_preference: str = "auto"
247
+ ) -> AsyncGenerator[Tuple[str, List[Dict], str], None]:
248
+ """
249
+ Stream from LLM with automatic fallback.
250
+ Returns (text, tool_calls, provider_used) tuples.
251
+ """
252
+
253
+ # Determine provider order based on preference
254
+ if provider_preference == "cost_optimize":
255
+ # Prefer free SambaNova first
256
+ provider_order = ["sambanova", "anthropic"]
257
+ elif provider_preference == "quality_first":
258
+ # Prefer Anthropic first
259
+ provider_order = ["anthropic", "sambanova"]
260
+ else: # auto
261
+ # Use Anthropic if available, fall back to SambaNova
262
+ provider_order = ["anthropic", "sambanova"] if "anthropic" in self.providers else ["sambanova"]
263
+
264
+ last_error = None
265
+
266
+ for provider_name in provider_order:
267
+ if provider_name not in self.providers:
268
+ continue
269
+
270
+ try:
271
+ logger.info(f"Attempting to use {provider_name} provider...")
272
+
273
+ if provider_name == "anthropic":
274
+ # Use existing Anthropic streaming
275
+ provider = self.providers["anthropic"]
276
+
277
+ # Stream from Anthropic
278
+ accumulated_text = ""
279
+ tool_calls = []
280
+
281
+ async with provider.messages.stream(
282
+ model=model or "claude-sonnet-4-5-20250929",
283
+ max_tokens=max_tokens,
284
+ messages=messages,
285
+ system=system_prompt,
286
+ tools=tools
287
+ ) as stream:
288
+ async for event in stream:
289
+ if event.type == "content_block_start":
290
+ if event.content_block.type == "tool_use":
291
+ tool_calls.append({
292
+ "id": event.content_block.id,
293
+ "name": event.content_block.name,
294
+ "input": {}
295
+ })
296
+
297
+ elif event.type == "content_block_delta":
298
+ if event.delta.type == "text_delta":
299
+ accumulated_text += event.delta.text
300
+ yield (accumulated_text, tool_calls, "Anthropic Claude")
301
+
302
+ # Get final message
303
+ final_message = await stream.get_final_message()
304
+
305
+ # Rebuild tool calls from final message
306
+ tool_calls.clear()
307
+ for block in final_message.content:
308
+ if block.type == "tool_use":
309
+ tool_calls.append({
310
+ "id": block.id,
311
+ "name": block.name,
312
+ "input": block.input
313
+ })
314
+
315
+ yield (accumulated_text, tool_calls, "Anthropic Claude")
316
+ return # Success!
317
+
318
+ elif provider_name == "sambanova":
319
+ # Use SambaNova streaming
320
+ provider = self.providers["sambanova"]
321
+
322
+ # Determine which Llama model to use
323
+ if max_tokens > 8192:
324
+ samba_model = "llama-3.1-405b" # Largest model for complex tasks
325
+ else:
326
+ # Default to Llama 3.3 70B - newest and best for most tasks
327
+ samba_model = "llama-3.3-70b"
328
+
329
+ async for text, tool_calls in provider.stream(
330
+ messages=messages,
331
+ system=system_prompt,
332
+ tools=tools,
333
+ model=samba_model,
334
+ max_tokens=max_tokens
335
+ ):
336
+ yield (text, tool_calls, f"SambaNova {samba_model}")
337
+
338
+ return # Success!
339
+
340
+ except Exception as e:
341
+ logger.warning(f"Provider {provider_name} failed: {e}")
342
+ last_error = e
343
+ continue
344
+
345
+ # All providers failed
346
+ error_msg = f"All LLM providers failed. Last error: {last_error}"
347
+ logger.error(error_msg)
348
+ raise Exception(error_msg)
349
+
350
+ async def cleanup(self):
351
+ """Clean up provider resources"""
352
+ if "sambanova" in self.providers:
353
+ await self.providers["sambanova"].close()
354
+
355
+
356
+ # Global router instance
357
+ llm_router = LLMRouter()
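+
+ # Usage sketch (illustrative): the router yields (text, tool_calls, provider)
+ # tuples and moves to the next provider automatically when one fails.
+ #
+ #     async for text, calls, provider in llm_router.stream_with_fallback(
+ #         messages=[{"role": "user", "content": "Hello"}],
+ #         tools=[],
+ #         system_prompt="You are a helpful research assistant.",
+ #         provider_preference="cost_optimize",
+ #     ):
+ #         print(provider, text[-40:])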
parallel_tool_execution.py ADDED
@@ -0,0 +1,235 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Parallel tool execution optimization for ALS Research Agent
4
+ This module replaces sequential tool execution with parallel execution
5
+ to reduce response time by ~60-70% for multi-tool queries.
6
+ """
7
+
8
+ import asyncio
9
+ from typing import List, Dict, Tuple, Any
10
+ import logging
11
+
12
+ logger = logging.getLogger(__name__)
13
+
14
+
15
+ async def execute_single_tool(
16
+ tool_call: Dict,
17
+ call_mcp_tool_func,
18
+ index: int
19
+ ) -> Tuple[int, str, Dict]:
20
+ """
21
+ Execute a single tool call asynchronously.
22
+ Returns (index, progress_text, result_dict) to maintain order.
23
+ """
24
+ tool_name = tool_call["name"]
25
+ tool_args = tool_call["input"]
26
+
27
+ # Show search info in progress text
28
+ tool_display = tool_name.replace('__', ' → ')
29
+ search_info = ""
30
+ if "query" in tool_args:
31
+ search_info = f" `{tool_args['query'][:50]}{'...' if len(tool_args['query']) > 50 else ''}`"
32
+ elif "condition" in tool_args:
33
+ search_info = f" `{tool_args['condition'][:50]}{'...' if len(tool_args['condition']) > 50 else ''}`"
34
+
35
+ try:
36
+ # Call MCP tool
37
+ start_time = asyncio.get_event_loop().time()
38
+ tool_result = await call_mcp_tool_func(tool_name, tool_args)
39
+ elapsed = asyncio.get_event_loop().time() - start_time
40
+
41
+ logger.info(f"Tool {tool_name} completed in {elapsed:.2f}s")
42
+
43
+ # Check for zero results to provide clear indicators
44
+ has_results = True
45
+ results_count = 0
46
+
47
+ if isinstance(tool_result, str):
48
+ result_lower = tool_result.lower()
49
+
50
+ # Check for specific result counts
51
+ import re
52
+ count_matches = re.findall(r'found (\d+) (?:papers?|trials?|preprints?|results?)', result_lower)
53
+ if count_matches:
54
+ results_count = int(count_matches[0])
55
+
56
+ # Check for no results
57
+ if any(phrase in result_lower for phrase in [
58
+ "no results found", "0 results", "no papers found",
59
+ "no trials found", "no preprints found", "not found",
60
+ "zero results", "no matches"
61
+ ]) or results_count == 0:
62
+ has_results = False
63
+
64
+ # Create clear success/failure indicator
65
+ if has_results:
66
+ if results_count > 0:
67
+ progress_text = f"\n✅ **Found {results_count} results:** {tool_display}{search_info}"
68
+ else:
69
+ progress_text = f"\n✅ **Success:** {tool_display}{search_info}"
70
+ else:
71
+ progress_text = f"\n⚠️ **No results:** {tool_display}{search_info} - will try alternatives"
72
+
73
+ # Add timing for long operations
74
+ if elapsed > 5:
75
+ progress_text += f" (took {elapsed:.1f}s)"
76
+
77
+ # Check for zero results to enable self-correction
78
+ if not has_results:
79
+ # Add self-correction hint to the result
80
+ tool_result += "\n\n**SELF-CORRECTION HINT:** No results found with this query. Consider:\n"
81
+ tool_result += "1. Broadening search terms (remove qualifiers)\n"
82
+ tool_result += "2. Using alternative terminology or synonyms\n"
83
+ tool_result += "3. Searching related concepts\n"
84
+ tool_result += "4. Checking for typos in search terms"
85
+
86
+ result_dict = {
87
+ "type": "tool_result",
88
+ "tool_use_id": tool_call["id"],
89
+ "content": tool_result
90
+ }
91
+
92
+ return index, progress_text, result_dict
93
+
94
+ except Exception as e:
95
+ logger.error(f"Error executing tool {tool_name}: {e}")
96
+
97
+ # Clear failure indicator for errors
98
+ progress_text = f"\n❌ **Failed:** {tool_display}{search_info} - {str(e)[:50]}"
99
+
100
+ error_result = {
101
+ "type": "tool_result",
102
+ "tool_use_id": tool_call["id"],
103
+ "content": f"Error executing tool: {str(e)}"
104
+ }
105
+ return index, progress_text, error_result
106
+
107
+
108
+ async def execute_tool_calls_parallel(
109
+ tool_calls: List[Dict],
110
+ call_mcp_tool_func
111
+ ) -> Tuple[str, List[Dict]]:
112
+ """
113
+ Execute tool calls in parallel and collect results.
114
+ Maintains the original order of tool calls in results.
115
+
116
+ Returns: (progress_text, tool_results_content)
117
+ """
118
+ if not tool_calls:
119
+ return "", []
120
+
121
+ # Track execution time for progress reporting
122
+ start_time = asyncio.get_event_loop().time()
123
+
124
+ # Log parallel execution
125
+ logger.info(f"Executing {len(tool_calls)} tools in parallel")
126
+
127
+ # Create tasks for parallel execution
128
+ tasks = [
129
+ execute_single_tool(tool_call, call_mcp_tool_func, i)
130
+ for i, tool_call in enumerate(tool_calls)
131
+ ]
132
+
133
+ # Execute all tasks in parallel
134
+ results = await asyncio.gather(*tasks, return_exceptions=True)
135
+
136
+ # Sort results by index to maintain original order
137
+ sorted_results = sorted(
138
+ [r for r in results if not isinstance(r, Exception)],
139
+ key=lambda x: x[0]
140
+ )
141
+
142
+ # Combine results with progress summary
143
+ completed_count = len(sorted_results)
144
+ total_count = len(tool_calls)
145
+
146
+ # Create progress summary with timing info
147
+ elapsed_time = asyncio.get_event_loop().time() - start_time
148
+
149
+ if elapsed_time > 5:
150
+ timing_info = f" in {elapsed_time:.1f}s"
151
+ else:
152
+ timing_info = ""
153
+
154
+ progress_text = f"\n📊 **Search Progress:** Completed {completed_count}/{total_count} searches{timing_info}\n"
155
+
156
+ tool_results_content = []
157
+
158
+ for index, prog_text, result_dict in sorted_results:
159
+ progress_text += prog_text
160
+ tool_results_content.append(result_dict)
161
+
162
+ # Handle any exceptions
163
+ for i, result in enumerate(results):
164
+ if isinstance(result, Exception):
165
+ logger.error(f"Task {i} failed with exception: {result}")
166
+ # Add error result for failed tasks
167
+ if i < len(tool_calls):
168
+                 # Results are matched by tool_use_id, so appending keeps them valid
+                 tool_results_content.append({
169
+ "type": "tool_result",
170
+ "tool_use_id": tool_calls[i]["id"],
171
+ "content": f"Tool execution failed: {str(result)}"
172
+ })
173
+
174
+ return progress_text, tool_results_content
175
+
176
+
177
+ # Backward compatibility wrapper
178
+ async def execute_tool_calls_optimized(
179
+ tool_calls: List[Dict],
180
+ call_mcp_tool_func,
181
+ parallel: bool = True
182
+ ) -> Tuple[str, List[Dict]]:
183
+ """
184
+ Execute tool calls with optional parallel execution.
185
+
186
+ Args:
187
+ tool_calls: List of tool calls to execute
188
+ call_mcp_tool_func: Function to call MCP tools
189
+ parallel: If True, execute tools in parallel; if False, execute sequentially
190
+
191
+ Returns: (progress_text, tool_results_content)
192
+ """
193
+ if parallel and len(tool_calls) > 1:
194
+ # Use parallel execution for multiple tools
195
+ return await execute_tool_calls_parallel(tool_calls, call_mcp_tool_func)
196
+ else:
197
+ # Fall back to sequential execution (import from original)
198
+ from refactored_helpers import execute_tool_calls
199
+ return await execute_tool_calls(tool_calls, call_mcp_tool_func)
200
+
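+ # Usage sketch (illustrative; `call_mcp_tool` here stands for any coroutine
+ # taking (tool_name, tool_args) and returning a result string):
+ #
+ #     progress, results = await execute_tool_calls_optimized(
+ #         tool_calls, call_mcp_tool, parallel=True
+ #     )
+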
201
+
202
+ def estimate_time_savings(num_tools: int, avg_tool_time: float = 3.5) -> Dict[str, float]:
203
+ """
204
+ Estimate time savings from parallel execution.
205
+
206
+ Args:
207
+ num_tools: Number of tools to execute
208
+ avg_tool_time: Average time per tool in seconds
209
+
210
+ Returns: Dictionary with timing estimates
211
+ """
212
+ sequential_time = num_tools * avg_tool_time
213
+ # Parallel time is roughly the time of the slowest tool plus overhead
214
+ parallel_time = avg_tool_time + 0.5 # 0.5s overhead for coordination
215
+
216
+ savings = sequential_time - parallel_time
217
+ savings_percent = (savings / sequential_time) * 100 if sequential_time > 0 else 0
218
+
219
+ return {
220
+ "sequential_time": sequential_time,
221
+ "parallel_time": parallel_time,
222
+ "time_saved": savings,
223
+ "savings_percent": savings_percent
224
+ }
225
+
226
+
227
+ # Test the optimization
228
+ if __name__ == "__main__":
229
+ # Test time savings estimation
230
+ for n in [2, 3, 4, 5]:
231
+ estimates = estimate_time_savings(n)
232
+ print(f"\n{n} tools:")
233
+ print(f" Sequential: {estimates['sequential_time']:.1f}s")
234
+ print(f" Parallel: {estimates['parallel_time']:.1f}s")
235
+ print(f" Savings: {estimates['time_saved']:.1f}s ({estimates['savings_percent']:.0f}%)")
query_classifier.py ADDED
@@ -0,0 +1,202 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Query Classification Module for ALS Research Agent
4
+ Determines whether a query requires full research workflow or simple response
5
+ """
6
+
7
+ import re
8
+ from typing import Any, Dict
9
+ import logging
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+
14
+ class QueryClassifier:
15
+ """Classify queries as research-required or simple questions"""
16
+
17
+ # Keywords that indicate ALS research is needed
18
+ RESEARCH_KEYWORDS = [
19
+ # Disease-specific terms
20
+ 'als', 'amyotrophic lateral sclerosis', 'motor neuron disease',
21
+ 'mnd', 'lou gehrig', 'ftd', 'frontotemporal dementia',
22
+
23
+ # Medical research terms
24
+ 'clinical trial', 'treatment', 'therapy', 'drug', 'medication',
25
+ 'gene therapy', 'stem cell', 'biomarker', 'diagnosis',
26
+ 'prognosis', 'survival', 'progression', 'symptom',
27
+ 'cure', 'breakthrough', 'research', 'study', 'paper',
28
+ 'latest', 'recent', 'new findings', 'advances',
29
+
30
+ # Specific ALS-related
31
+ 'riluzole', 'edaravone', 'radicava', 'relyvrio', 'qalsody',
32
+ 'tofersen', 'sod1', 'c9orf72', 'tdp-43', 'fus',
33
+
34
+ # Research actions
35
+ 'find studies', 'search papers', 'what research',
36
+ 'clinical evidence', 'scientific literature'
37
+ ]
38
+
39
+ # Keywords that indicate simple/general questions
40
+ SIMPLE_KEYWORDS = [
41
+ 'hello', 'hi', 'hey', 'thanks', 'thank you',
42
+ 'how are you', "what's your name", 'who are you',
43
+ 'what can you do', 'help', 'test', 'testing',
44
+ 'explain', 'define', 'what is', 'what are',
45
+ 'how does', 'why', 'when', 'where', 'who'
46
+ ]
47
+
48
+ # Exclusion patterns for non-research queries
49
+ NON_RESEARCH_PATTERNS = [
50
+ r'^(hi|hello|hey|thanks|thank you)',
51
+ r'^test\s',
52
+ r'^how (are you|do you)',
53
+ r'^what (is|are) (the|a|an)\s+\w+$', # Simple definitions
54
+ r'^(explain|define)\s+\w+$', # Simple explanations
55
+ r'^\w{1,3}$', # Very short queries
56
+ ]
57
+
58
+ @classmethod
59
+     def classify_query(cls, query: str) -> Dict[str, Any]:
60
+ """
61
+ Classify a query and determine processing strategy.
62
+
63
+ Returns:
64
+ Dict with:
65
+ - requires_research: bool - Whether to use full research workflow
66
+ - confidence: float - Confidence in classification (0-1)
67
+ - reason: str - Explanation of classification
68
+ - suggested_mode: str - 'research' or 'simple'
69
+ """
70
+ query_lower = query.lower().strip()
71
+
72
+ # Check for very short or empty queries
73
+ if len(query_lower) < 5:
74
+ return {
75
+ 'requires_research': False,
76
+ 'confidence': 0.9,
77
+ 'reason': 'Query too short for research',
78
+ 'suggested_mode': 'simple'
79
+ }
80
+
81
+ # Check exclusion patterns first
82
+ for pattern in cls.NON_RESEARCH_PATTERNS:
83
+ if re.match(pattern, query_lower):
84
+ return {
85
+ 'requires_research': False,
86
+ 'confidence': 0.85,
87
+ 'reason': 'Matches non-research pattern',
88
+ 'suggested_mode': 'simple'
89
+ }
90
+
91
+ # Count research keywords
92
+ research_score = sum(
93
+ 1 for keyword in cls.RESEARCH_KEYWORDS
94
+ if keyword in query_lower
95
+ )
96
+
97
+ # Count simple keywords
98
+ simple_score = sum(
99
+ 1 for keyword in cls.SIMPLE_KEYWORDS
100
+ if keyword in query_lower
101
+ )
102
+
103
+ # Check for question complexity
104
+ has_multiple_questions = query.count('?') > 1
105
+ has_complex_structure = len(query.split()) > 15
106
+ mentions_comparison = any(word in query_lower for word in
107
+ ['compare', 'versus', 'vs', 'difference between'])
108
+
109
+ # Decision logic - Conservative approach for ALS research agent
110
+
111
+ # FIRST: Check if this is truly just a greeting/thanks (only these skip research)
112
+ greeting_only = query_lower in ['hi', 'hello', 'hey', 'thanks', 'thank you', 'bye', 'goodbye', 'test']
113
+ if greeting_only and research_score == 0:
114
+ return {
115
+ 'requires_research': False,
116
+ 'confidence': 0.95,
117
+ 'reason': 'Pure greeting or acknowledgment',
118
+ 'suggested_mode': 'simple'
119
+ }
120
+
121
+ # SECOND: If ANY research keyword is present, use research mode
122
+ # This includes "ALS", "treatment", "therapy", etc.
123
+ if research_score >= 1:
124
+ return {
125
+ 'requires_research': True,
126
+ 'confidence': min(0.95, 0.7 + research_score * 0.1),
127
+ 'reason': f'Contains research-related terms ({research_score} keywords)',
128
+ 'suggested_mode': 'research'
129
+ }
130
+
131
+ # THIRD: Check for questions about the agent itself
132
+ about_agent = any(phrase in query_lower for phrase in [
133
+ 'who are you', 'what can you do', 'how do you work',
134
+ 'what are you', 'your capabilities'
135
+ ])
136
+ if about_agent:
137
+ return {
138
+ 'requires_research': False,
139
+ 'confidence': 0.85,
140
+ 'reason': 'Question about the agent itself',
141
+ 'suggested_mode': 'simple'
142
+ }
143
+
144
+ # DEFAULT: For an ALS research agent, when in doubt, use research mode
145
+ # This is safer than potentially missing important medical queries
146
+ return {
147
+ 'requires_research': True,
148
+ 'confidence': 0.6,
149
+ 'reason': 'Default to research mode for potential medical queries',
150
+ 'suggested_mode': 'research'
151
+ }
152
+
153
+ @classmethod
154
+ def should_use_tools(cls, query: str) -> bool:
155
+ """Quick check if query needs research tools"""
156
+ classification = cls.classify_query(query)
157
+ return classification['requires_research'] and classification['confidence'] > 0.65
158
+
159
+ @classmethod
160
+ def get_processing_hint(cls, classification: Dict) -> str:
161
+ """Get a hint for how to process the query"""
162
+ if classification['requires_research']:
163
+ return "🔬 Using full research workflow ..."
164
+ else:
165
+ return "💬 Providing direct response without research tools"
166
+
167
+
168
+ def test_classifier():
169
+ """Test the classifier with example queries"""
170
+ test_queries = [
171
+ # Should require research
172
+ "What are the latest gene therapy trials for ALS?",
173
+ "Compare riluzole and edaravone effectiveness",
174
+ "Find recent studies on SOD1 mutations",
175
+ "What breakthroughs in ALS treatment happened in 2024?",
176
+ "Are there any promising stem cell therapies for motor neuron disease?",
177
+
178
+ # Should NOT require research
179
+ "Hello, how are you?",
180
+ "What is your name?",
181
+ "Test",
182
+ "Thanks for your help",
183
+ "Explain what a database is",
184
+ "What time is it?",
185
+ "How do I use this app?",
186
+ ]
187
+
188
+ print("Query Classification Test Results")
189
+ print("=" * 60)
190
+
191
+ for query in test_queries:
192
+ result = QueryClassifier.classify_query(query)
193
+ print(f"\nQuery: {query[:50]}...")
194
+ print(f"Requires Research: {result['requires_research']}")
195
+ print(f"Confidence: {result['confidence']:.2f}")
196
+ print(f"Reason: {result['reason']}")
197
+ print(f"Mode: {result['suggested_mode']}")
198
+ print("-" * 40)
199
+
200
+
201
+ if __name__ == "__main__":
202
+ test_classifier()
refactored_helpers.py ADDED
@@ -0,0 +1,200 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Helper functions to consolidate duplicate code in als_agent_app.py
4
+ Refactored to improve efficiency and reduce redundancy
5
+ """
6
+
7
+ import logging
+ from typing import AsyncGenerator, List, Dict, Tuple
12
+ from llm_client import UnifiedLLMClient
13
+
14
+ logger = logging.getLogger(__name__)
15
+
16
+
17
+ async def stream_with_retry(
18
+ client,
19
+ messages: List[Dict],
20
+ tools: List[Dict],
21
+ system_prompt: str,
22
+ max_retries: int = 2,
23
+ model: str = None,
24
+ max_tokens: int = 8192,
25
+ stream_name: str = "API call",
26
+ temperature: float = 0.7
27
+ ) -> AsyncGenerator[Tuple[str, List[Dict], str], None]:
28
+ """
29
+ Simplified wrapper that delegates to UnifiedLLMClient.
30
+
31
+ The client parameter can be:
32
+ - An Anthropic client (for backward compatibility)
33
+ - A UnifiedLLMClient instance
34
+ - None (will create a UnifiedLLMClient)
35
+
36
+ Yields: (response_text, tool_calls, provider_used) tuples
37
+ """
38
+
39
+ # If client is None or is an Anthropic client, use UnifiedLLMClient
40
+ if client is None or not hasattr(client, 'stream'):
41
+ # Create or get a UnifiedLLMClient instance
42
+ llm_client = UnifiedLLMClient()
43
+
44
+ logger.info(f"Using {llm_client.get_provider_display_name()} for {stream_name}")
45
+
46
+ try:
47
+ # Use the unified client's stream method
48
+ async for text, tool_calls, provider in llm_client.stream(
49
+ messages=messages,
50
+ tools=tools,
51
+ system_prompt=system_prompt,
52
+ model=model,
53
+ max_tokens=max_tokens,
54
+ temperature=temperature
55
+ ):
56
+ yield (text, tool_calls, provider)
57
+ finally:
58
+ # Clean up if we created the client
59
+ await llm_client.cleanup()
60
+
61
+ else:
62
+ # Client is already a UnifiedLLMClient
63
+ logger.info(f"Using provided {client.get_provider_display_name()} for {stream_name}")
64
+
65
+ async for text, tool_calls, provider in client.stream(
66
+ messages=messages,
67
+ tools=tools,
68
+ system_prompt=system_prompt,
69
+ model=model,
70
+ max_tokens=max_tokens,
71
+ temperature=temperature
72
+ ):
73
+ yield (text, tool_calls, provider)
74
+
75
+
76
+ async def execute_tool_calls(
77
+ tool_calls: List[Dict],
78
+ call_mcp_tool_func
79
+ ) -> Tuple[str, List[Dict]]:
80
+ """
81
+ Execute tool calls and collect results.
82
+ Consolidates duplicate tool execution logic.
83
+ Now includes self-correction hints for zero results.
84
+
85
+ Returns: (progress_text, tool_results_content)
86
+ """
87
+ progress_text = ""
88
+ tool_results_content = []
89
+ zero_result_tools = []
90
+
91
+ for tool_call in tool_calls:
92
+ tool_name = tool_call["name"]
93
+ tool_args = tool_call["input"]
94
+
95
+ # Single, clean execution marker with search info
96
+ tool_display = tool_name.replace('__', ' → ')
97
+
98
+ # Show key search parameters
99
+ search_info = ""
100
+ if "query" in tool_args:
101
+ search_info = f" `{tool_args['query'][:50]}{'...' if len(tool_args['query']) > 50 else ''}`"
102
+ elif "condition" in tool_args:
103
+ search_info = f" `{tool_args['condition'][:50]}{'...' if len(tool_args['condition']) > 50 else ''}`"
104
+
105
+ progress_text += f"\n🔧 **Searching:** {tool_display}{search_info}\n"
106
+
107
+ # Call MCP tool
108
+ tool_result = await call_mcp_tool_func(tool_name, tool_args)
109
+
110
+ # Check for zero results to enable self-correction
111
+ if isinstance(tool_result, str):
112
+ result_lower = tool_result.lower()
113
+ if any(phrase in result_lower for phrase in [
114
+ "no results found", "0 results", "no papers found",
115
+ "no trials found", "no preprints found", "not found",
116
+ "zero results", "no matches"
117
+ ]):
118
+ zero_result_tools.append((tool_name, tool_args))
119
+
120
+ # Add self-correction hint to the result
121
+ tool_result += "\n\n**SELF-CORRECTION HINT:** No results found with this query. Consider:\n"
122
+ tool_result += "1. Broadening search terms (remove qualifiers)\n"
123
+ tool_result += "2. Using alternative terminology or synonyms\n"
124
+ tool_result += "3. Searching related concepts\n"
125
+ tool_result += "4. Checking for typos in search terms"
126
+
127
+ # Add to results array
128
+ tool_results_content.append({
129
+ "type": "tool_result",
130
+ "tool_use_id": tool_call["id"],
131
+ "content": tool_result
132
+ })
133
+
134
+ return progress_text, tool_results_content
135
+
136
+
137
+ def build_assistant_message(
138
+ text_content: str,
139
+ tool_calls: List[Dict],
140
+ strip_markers: List[str] = None
141
+ ) -> List[Dict]:
142
+ """
143
+ Build assistant message content with text and tool uses.
144
+ Consolidates duplicate message building logic.
145
+
146
+ Args:
147
+ text_content: Text content to include
148
+ tool_calls: List of tool calls to include
149
+ strip_markers: Optional list of text markers to strip from content
150
+
151
+ Returns: List of content blocks for assistant message
152
+ """
153
+ assistant_content = []
154
+
155
+ # Process text content
156
+ if text_content and text_content.strip():
157
+ processed_text = text_content
158
+
159
+ # Strip any specified markers
160
+ if strip_markers:
161
+ for marker in strip_markers:
162
+ processed_text = processed_text.replace(marker, "")
163
+
164
+ processed_text = processed_text.strip()
165
+
166
+ if processed_text:
167
+ assistant_content.append({
168
+ "type": "text",
169
+ "text": processed_text
170
+ })
171
+
172
+ # Add tool uses
173
+ for tc in tool_calls:
174
+ assistant_content.append({
175
+ "type": "tool_use",
176
+ "id": tc["id"],
177
+ "name": tc["name"],
178
+ "input": tc["input"]
179
+ })
180
+
181
+ return assistant_content
182
+
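+ # Illustrative output shape:
+ #     build_assistant_message("Done.", [{"id": "t1", "name": "search", "input": {}}])
+ #     -> [{"type": "text", "text": "Done."},
+ #         {"type": "tool_use", "id": "t1", "name": "search", "input": {}}]
+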
183
+
184
+ def should_continue_iterations(
185
+ iteration_count: int,
186
+ max_iterations: int,
187
+ tool_calls: List[Dict]
188
+ ) -> bool:
189
+ """
190
+ Check if tool iterations should continue.
191
+ Centralizes iteration control logic.
192
+ """
193
+ if not tool_calls:
194
+ return False
195
+
196
+ if iteration_count >= max_iterations:
197
+ logger.warning(f"Reached maximum tool iterations ({max_iterations})")
198
+ return False
199
+
200
+ return True
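+
+ # Typical driver loop (sketch; MAX_ITERATIONS and call_mcp_tool are
+ # illustrative names from the calling application):
+ #
+ #     iteration = 0
+ #     tool_calls = first_round_tool_calls
+ #     while should_continue_iterations(iteration, MAX_ITERATIONS, tool_calls):
+ #         progress, results = await execute_tool_calls(tool_calls, call_mcp_tool)
+ #         iteration += 1
+ #         ...  # send results back to the model and collect new tool_calls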
requirements.txt ADDED
@@ -0,0 +1,37 @@
1
+ # Core Dependencies
2
+ # REQUIRES PYTHON 3.10+ (Recommended: Python 3.12)
3
+ # Compatible with Gradio 5.x and 6.x
4
+ gradio>=6.0.0
5
+ anthropic>=0.34.0
6
+ mcp>=1.21.2 # FastMCP API required
7
+ httpx>=0.25.0
8
+
9
+ # Web Scraping
10
+ beautifulsoup4>=4.12.0
11
+ lxml>=4.9.0
12
+
13
+ # Database Access (for AACT clinical trials database)
14
+ psycopg2-binary>=2.9.0
15
+ asyncpg>=0.29.0 # For async PostgreSQL with connection pooling
16
+
17
+ # Configuration
18
+ python-dotenv>=1.0.0
19
+
20
+ # Testing
21
+ pytest>=7.4.0
22
+ pytest-asyncio>=0.21.0
23
+ pytest-cov>=4.1.0
24
+ pytest-mock>=3.12.0
25
+
26
+ # RAG and Research Memory (LlamaIndex)
27
+ llama-index-core>=0.11.0
28
+ llama-index-vector-stores-chroma>=0.2.0
29
+ llama-index-embeddings-huggingface>=0.3.0
30
+ chromadb>=0.5.0
31
+ sentence-transformers>=3.0.0
32
+ transformers>=4.30.0
33
+
34
+ # Development
35
+ black>=23.0.0
36
+ flake8>=6.1.0
37
+ mypy>=1.7.0
servers/aact_server.py ADDED
@@ -0,0 +1,472 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ AACT Database MCP Server with Connection Pooling
4
+ Provides access to ClinicalTrials.gov data through the AACT PostgreSQL database.
5
+
6
+ AACT (Aggregate Analysis of ClinicalTrials.gov) is maintained by Duke University
7
+ and FDA, providing complete ClinicalTrials.gov data updated daily.
8
+
9
+ Database access: aact-db.ctti-clinicaltrials.org
10
+ """
11
+
12
+ import os
13
+ import sys
14
+ import json
15
+ import logging
16
+ import asyncio
17
+ from typing import Optional, Dict, Any, List
18
+ from datetime import datetime, timedelta
19
+ from pathlib import Path
20
+
21
+ # Add parent directory to path for shared imports
22
+ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
23
+ from shared.config import config
24
+
25
+ # Load .env file if it exists
26
+ try:
27
+ from dotenv import load_dotenv
28
+ env_path = Path(__file__).parent.parent / '.env'
29
+ if env_path.exists():
30
+ load_dotenv(env_path)
31
+ logging.info(f"Loaded .env file from {env_path}")
32
+ except ImportError:
33
+ pass
34
+
35
+ # Try both asyncpg (preferred) and psycopg2 (fallback)
36
+ try:
37
+ import asyncpg
38
+ ASYNCPG_AVAILABLE = True
39
+ except ImportError:
40
+ ASYNCPG_AVAILABLE = False
41
+
42
+ try:
43
+ import psycopg2
44
+ from psycopg2.extras import RealDictCursor
45
+ POSTGRES_AVAILABLE = True
46
+ except ImportError:
47
+ POSTGRES_AVAILABLE = False
48
+
49
+ from mcp.server.fastmcp import FastMCP
50
+
51
+ # Setup logging
52
+ logging.basicConfig(level=logging.INFO)
53
+ logger = logging.getLogger(__name__)
54
+
55
+ # Initialize MCP server
56
+ mcp = FastMCP("aact-database")
57
+
58
+ # Database configuration
59
+ AACT_HOST = os.getenv("AACT_HOST", "aact-db.ctti-clinicaltrials.org")
60
+ AACT_PORT = os.getenv("AACT_PORT", "5432")
61
+ AACT_DB = os.getenv("AACT_DB", "aact")
62
+ AACT_USER = os.getenv("AACT_USER", "aact")
63
+ AACT_PASSWORD = os.getenv("AACT_PASSWORD", "")
64
+
65
+ # Global connection pool (initialized once)
66
+ # Annotation is quoted so this module still imports when asyncpg is absent
+ _connection_pool: Optional["asyncpg.Pool"] = None
67
+
68
+ async def get_connection_pool() -> "asyncpg.Pool":
69
+ """Get or create the global connection pool"""
70
+ global _connection_pool
71
+
72
+ if _connection_pool is None or _connection_pool._closed:
73
+ logger.info("Creating new database connection pool...")
74
+
75
+ # Build connection URL
76
+ if AACT_PASSWORD:
77
+ dsn = f"postgresql://{AACT_USER}:{AACT_PASSWORD}@{AACT_HOST}:{AACT_PORT}/{AACT_DB}"
78
+ else:
79
+ dsn = f"postgresql://{AACT_USER}@{AACT_HOST}:{AACT_PORT}/{AACT_DB}"
80
+
81
+ try:
82
+ _connection_pool = await asyncpg.create_pool(
83
+ dsn=dsn,
84
+ min_size=2, # Minimum connections in pool
85
+ max_size=10, # Maximum connections in pool
86
+ max_queries=50000, # Max queries per connection before recycling
87
+ max_inactive_connection_lifetime=300, # Close idle connections after 5 min
88
+ command_timeout=60.0, # Query timeout
89
+ statement_cache_size=20, # Cache prepared statements
90
+ )
91
+ logger.info("Connection pool created successfully")
92
+ except Exception as e:
93
+ logger.error(f"Failed to create connection pool: {e}")
94
+ raise
95
+
96
+ return _connection_pool
97
+
98
+ async def execute_query_pooled(query: str, params: tuple = ()) -> List[Dict]:
99
+ """Execute query using connection pool (asyncpg)"""
100
+ pool = await get_connection_pool()
101
+
102
+ async with pool.acquire() as conn:
103
+ # Convert rows to dicts
104
+ rows = await conn.fetch(query, *params)
105
+ return [dict(row) for row in rows]
106
+
107
+ def execute_query_sync(query: str, params: tuple = ()) -> List[Dict]:
108
+ """Fallback: Execute query synchronously (psycopg2)"""
109
+ conn = None
110
+ cursor = None
111
+ try:
112
+ # Build connection string
113
+ conn_params = {
114
+ "host": AACT_HOST,
115
+ "port": AACT_PORT,
116
+ "dbname": AACT_DB,
117
+ "user": AACT_USER,
118
+ }
119
+
120
+ if AACT_PASSWORD:
121
+ conn_params["password"] = AACT_PASSWORD
122
+
123
+ conn = psycopg2.connect(**conn_params)
124
+ cursor = conn.cursor(cursor_factory=RealDictCursor)
125
+         # Named parameters tolerate placeholders that repeat in the query
+         cursor.execute(query, {f"p{i + 1}": v for i, v in enumerate(params)})
126
+ results = cursor.fetchall()
127
+
128
+ return [dict(row) for row in results]
129
+
130
+ except Exception as e:
131
+ logger.error(f"Database query failed: {e}")
132
+ raise
133
+ finally:
134
+ if cursor:
135
+ cursor.close()
136
+ if conn:
137
+ conn.close()
138
+
139
+ async def execute_query(query: str, params: tuple = ()) -> List[Dict]:
140
+ """Execute query using best available method"""
141
+ if ASYNCPG_AVAILABLE:
142
+ # Prefer asyncpg with connection pooling
143
+ return await execute_query_pooled(query, params)
144
+ elif POSTGRES_AVAILABLE:
145
+ # Fallback to synchronous psycopg2
146
+ return await asyncio.to_thread(execute_query_sync, query, params)
147
+ else:
148
+ raise RuntimeError("No PostgreSQL driver available (install asyncpg or psycopg2)")
149
+
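+ # Example (illustrative): results come back as a list of plain dicts, e.g.
+ #     rows = await execute_query("SELECT nct_id FROM studies LIMIT $1", (5,))
+ #     rows[0]["nct_id"]  # -> "NCT0..."
+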
150
+ @mcp.tool()
151
+ async def search_als_trials(
152
+ status: Optional[str] = "RECRUITING",
153
+ phase: Optional[str] = None,
154
+ intervention: Optional[str] = None,
155
+ location: Optional[str] = None,
156
+ max_results: int = 20
157
+ ) -> str:
158
+ """Search for ALS clinical trials in the AACT database.
159
+
160
+ Args:
161
+ status: Trial status (RECRUITING, ENROLLING_BY_INVITATION, ACTIVE_NOT_RECRUITING, COMPLETED)
162
+ phase: Trial phase (PHASE_1, PHASE_2, PHASE_3, PHASE_4, EARLY_PHASE_1)
163
+ intervention: Type of intervention to search for
164
+ location: Country or region
165
+ max_results: Maximum number of results to return
166
+ """
167
+
168
+ if not (ASYNCPG_AVAILABLE or POSTGRES_AVAILABLE):
169
+ return json.dumps({
170
+ "error": "Database not available",
171
+ "message": "PostgreSQL driver not installed. Install asyncpg or psycopg2-binary."
172
+ })
173
+
174
+ logger.info(f"🔎 AACT Search: status={status}, phase={phase}, intervention={intervention}, location={location}")
175
+
176
+ try:
177
+ # Build the query with proper filters
178
+ base_query = """
179
+ SELECT DISTINCT
180
+ s.nct_id,
181
+ s.brief_title,
182
+ s.overall_status,
183
+ s.phase,
184
+ s.enrollment,
185
+ s.start_date,
186
+ s.completion_date,
187
+ s.study_type,
188
+ s.official_title,
189
+ d.name as sponsor,
190
+ STRING_AGG(DISTINCT i.name, ', ') as interventions,
191
+ STRING_AGG(DISTINCT c.name, ', ') as conditions,
192
+ COUNT(DISTINCT f.id) as num_locations
193
+ FROM studies s
194
+ LEFT JOIN sponsors sp ON s.nct_id = sp.nct_id AND sp.lead_or_collaborator = 'lead'
195
+ LEFT JOIN responsible_parties d ON sp.nct_id = d.nct_id
196
+ LEFT JOIN interventions i ON s.nct_id = i.nct_id
197
+ LEFT JOIN conditions c ON s.nct_id = c.nct_id
198
+ LEFT JOIN facilities f ON s.nct_id = f.nct_id
199
+ WHERE (
200
+ LOWER(c.name) LIKE '%amyotrophic lateral sclerosis%' OR
201
+ LOWER(c.name) LIKE '%als %' OR
202
+ LOWER(c.name) LIKE '% als' OR
203
+ LOWER(c.name) LIKE '%motor neuron disease%' OR
204
+ LOWER(c.name) LIKE '%lou gehrig%' OR
205
+ LOWER(s.brief_title) LIKE '%amyotrophic lateral sclerosis%' OR
206
+ LOWER(s.brief_title) LIKE '%als %' OR
207
+ LOWER(s.brief_title) LIKE '% als' OR
208
+ LOWER(s.official_title) LIKE '%amyotrophic lateral sclerosis%' OR
209
+ LOWER(s.official_title) LIKE '%als %' OR
210
+ LOWER(s.official_title) LIKE '% als'
211
+ )
212
+ """
213
+
214
+ # Apply filters
215
+ conditions = []
216
+ params = []
217
+ param_count = 1
218
+
219
+ if status:
220
+ conditions.append(f"UPPER(s.overall_status) = ${param_count}")
221
+ params.append(status.upper())
222
+ param_count += 1
223
+
224
+ if phase:
225
+ phase_map = {
226
+ 'PHASE_1': 'Phase 1',
227
+ 'PHASE_2': 'Phase 2',
228
+ 'PHASE_3': 'Phase 3',
229
+ 'PHASE_4': 'Phase 4',
230
+ 'EARLY_PHASE_1': 'Early Phase 1'
231
+ }
232
+ mapped_phase = phase_map.get(phase.upper(), phase)
233
+ conditions.append(f"s.phase = ${param_count}")
234
+ params.append(mapped_phase)
235
+ param_count += 1
236
+
237
+ if intervention:
238
+ conditions.append(f"LOWER(i.name) LIKE ${param_count}")
239
+ params.append(f"%{intervention.lower()}%")
240
+ param_count += 1
241
+
242
+ if location:
243
+ base_query = base_query.replace("LEFT JOIN facilities f", "INNER JOIN facilities f")
244
+ conditions.append(f"(LOWER(f.country) LIKE ${param_count} OR LOWER(f.state) LIKE ${param_count})")
245
+ params.append(f"%{location.lower()}%")
246
+ param_count += 1
247
+
248
+ # Add conditions to query
249
+ if conditions:
250
+ base_query += " AND " + " AND ".join(conditions)
251
+
252
+ # Add GROUP BY and ORDER BY
253
+ base_query += f"""
254
+ GROUP BY s.nct_id, s.brief_title, s.overall_status, s.phase,
255
+ s.enrollment, s.start_date, s.completion_date,
256
+ s.study_type, s.official_title, d.name
257
+ ORDER BY
258
+ CASE UPPER(s.overall_status)
259
+ WHEN 'RECRUITING' THEN 1
260
+ WHEN 'ENROLLING_BY_INVITATION' THEN 2
261
+ WHEN 'ACTIVE_NOT_RECRUITING' THEN 3
262
+ WHEN 'NOT_YET_RECRUITING' THEN 4
263
+ ELSE 5
264
+ END,
265
+ s.start_date DESC NULLS LAST
266
+ LIMIT ${param_count}
267
+ """
268
+ params.append(max_results)
269
+
270
+ # Execute query
271
+ logger.debug(f"📊 Executing query with {len(params)} parameters")
272
+ results = await execute_query(base_query, tuple(params))
273
+
274
+ logger.info(f"✅ AACT Results: Found {len(results) if results else 0} trials")
275
+
276
+ if not results:
277
+ return json.dumps({
278
+ "message": "No ALS trials found matching your criteria",
279
+ "total": 0,
280
+ "trials": []
281
+ })
282
+
283
+ # Format results
284
+ trials = []
285
+ for row in results:
286
+ trial = {
287
+ "nct_id": row['nct_id'],
288
+ "title": row['brief_title'],
289
+ "status": row['overall_status'],
290
+ "phase": row['phase'],
291
+ "enrollment": row['enrollment'],
292
+ "sponsor": row['sponsor'],
293
+ "interventions": row['interventions'],
294
+ "conditions": row['conditions'],
295
+ "locations_count": row['num_locations'],
296
+ "start_date": str(row['start_date']) if row['start_date'] else None,
297
+ "completion_date": str(row['completion_date']) if row['completion_date'] else None,
298
+ "url": f"https://clinicaltrials.gov/study/{row['nct_id']}"
299
+ }
300
+ trials.append(trial)
301
+
302
+ return json.dumps({
303
+ "message": f"Found {len(trials)} ALS clinical trials",
304
+ "total": len(trials),
305
+ "trials": trials
306
+ }, indent=2)
307
+
308
+ except Exception as e:
309
+ logger.error(f"❌ AACT Database query failed: {e}")
310
+ logger.error(f" Query type: search_als_trials")
311
+ logger.error(f" Parameters: status={status}, phase={phase}, intervention={intervention}")
312
+ return json.dumps({
313
+ "error": "Database query failed",
314
+ "message": str(e)
315
+ })
316
+
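For local smoke-testing, the tool coroutine can be awaited directly and its JSON payload parsed; a minimal sketch (assumes a reachable AACT connection, and that the decorated function is still directly awaitable in your FastMCP version):

```python
import asyncio
import json

async def demo():
    raw = await search_als_trials(status="RECRUITING", phase="PHASE_3", max_results=5)
    payload = json.loads(raw)
    for trial in payload.get("trials", []):
        print(f"{trial['nct_id']}: {trial['title']} ({trial['status']})")

# asyncio.run(demo())
```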
317
+ @mcp.tool()
318
+ async def get_trial_details(nct_id: str) -> str:
319
+ """Get detailed information about a specific clinical trial.
320
+
321
+ Args:
322
+ nct_id: The NCT ID of the trial (e.g., 'NCT04856982')
323
+ """
324
+
325
+ if not (ASYNCPG_AVAILABLE or POSTGRES_AVAILABLE):
326
+ return json.dumps({
327
+ "error": "Database not available",
328
+ "message": "PostgreSQL driver not installed."
329
+ })
330
+
331
+ try:
332
+ # Main trial information
333
+ main_query = """
334
+ SELECT
335
+ s.nct_id,
336
+ s.brief_title,
337
+ s.official_title,
338
+ s.overall_status,
339
+ s.phase,
340
+ s.study_type,
341
+ s.enrollment,
342
+ s.start_date,
343
+ s.primary_completion_date,
344
+ s.completion_date,
345
+ s.first_posted_date,
346
+ s.last_update_posted_date,
347
+ s.why_stopped,
348
+ b.description as brief_summary,
349
+ dd.description as detailed_description,
350
+ e.criteria as eligibility_criteria,
351
+ e.gender,
352
+ e.minimum_age,
353
+ e.maximum_age,
354
+ e.healthy_volunteers,
355
+ rp.name as sponsor,
356
+ rp.responsible_party_type
357
+ FROM studies s
358
+ LEFT JOIN brief_summaries b ON s.nct_id = b.nct_id
359
+ LEFT JOIN detailed_descriptions dd ON s.nct_id = dd.nct_id
360
+ LEFT JOIN eligibilities e ON s.nct_id = e.nct_id
361
+ LEFT JOIN responsible_parties rp ON s.nct_id = rp.nct_id
362
+ WHERE s.nct_id = $1
363
+ """
364
+
365
+ results = await execute_query(main_query, (nct_id,))
366
+
367
+ if not results:
368
+ return json.dumps({
369
+ "error": "Trial not found",
370
+ "message": f"No trial found with NCT ID: {nct_id}"
371
+ })
372
+
373
+ trial_info = results[0]
374
+
375
+ # Get outcomes
376
+ outcomes_query = """
377
+ SELECT outcome_type, measure, time_frame, description
378
+ FROM outcomes
379
+ WHERE nct_id = $1
380
+ ORDER BY outcome_type, id
381
+ LIMIT 20
382
+ """
383
+ outcomes = await execute_query(outcomes_query, (nct_id,))
384
+
385
+ # Get interventions
386
+ interventions_query = """
387
+ SELECT intervention_type, name, description
388
+ FROM interventions
389
+ WHERE nct_id = $1
390
+ """
391
+ interventions = await execute_query(interventions_query, (nct_id,))
392
+
393
+ # Get locations
394
+ locations_query = """
395
+ SELECT name, city, state, country, status
396
+ FROM facilities
397
+ WHERE nct_id = $1
398
+ LIMIT 50
399
+ """
400
+ locations = await execute_query(locations_query, (nct_id,))
401
+
402
+ # Format the response
403
+ return json.dumps({
404
+ "nct_id": trial_info['nct_id'],
405
+ "title": trial_info['brief_title'],
406
+ "official_title": trial_info['official_title'],
407
+ "status": trial_info['overall_status'],
408
+ "phase": trial_info['phase'],
409
+ "study_type": trial_info['study_type'],
410
+ "enrollment": trial_info['enrollment'],
411
+ "sponsor": trial_info['sponsor'],
412
+ "dates": {
413
+ "start": str(trial_info['start_date']) if trial_info['start_date'] else None,
414
+ "primary_completion": str(trial_info['primary_completion_date']) if trial_info['primary_completion_date'] else None,
415
+ "completion": str(trial_info['completion_date']) if trial_info['completion_date'] else None,
416
+ "first_posted": str(trial_info['first_posted_date']) if trial_info['first_posted_date'] else None,
417
+ "last_updated": str(trial_info['last_update_posted_date']) if trial_info['last_update_posted_date'] else None
418
+ },
419
+ "summary": trial_info['brief_summary'],
420
+ "detailed_description": trial_info['detailed_description'],
421
+ "eligibility": {
422
+ "criteria": trial_info['eligibility_criteria'],
423
+ "gender": trial_info['gender'],
424
+ "age_range": f"{trial_info['minimum_age'] or 'N/A'} - {trial_info['maximum_age'] or 'N/A'}",
425
+ "healthy_volunteers": trial_info['healthy_volunteers']
426
+ },
427
+ "outcomes": [
428
+ {
429
+ "type": o['outcome_type'],
430
+ "measure": o['measure'],
431
+ "time_frame": o['time_frame'],
432
+ "description": o['description']
433
+ } for o in outcomes
434
+ ],
435
+ "interventions": [
436
+ {
437
+ "type": i['intervention_type'],
438
+ "name": i['name'],
439
+ "description": i['description']
440
+ } for i in interventions
441
+ ],
442
+ "locations": [
443
+ {
444
+ "name": l['name'],
445
+ "city": l['city'],
446
+ "state": l['state'],
447
+ "country": l['country'],
448
+ "status": l['status']
449
+ } for l in locations
450
+ ],
451
+ "url": f"https://clinicaltrials.gov/study/{nct_id}"
452
+ }, indent=2)
453
+
454
+ except Exception as e:
455
+ logger.error(f"Failed to get trial details: {e}")
456
+ return json.dumps({
457
+ "error": "Database query failed",
458
+ "message": str(e)
459
+ })
460
+
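The response formatting above repeats `str(x) if x else None` for every date field; a tiny helper would collapse the pattern (hypothetical refactor, not part of the server as written):

```python
def iso_or_none(value) -> str | None:
    # Stringify an AACT date value, preserving NULL as JSON null.
    return str(value) if value else None

# e.g. "start": iso_or_none(trial_info["start_date"])
```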
461
+ # Cleanup on shutdown
462
+ # Note: FastMCP doesn't have a built-in shutdown handler
463
+ # The connection pool will be closed when the process ends
464
+ # async def cleanup():
465
+ # """Close the connection pool on shutdown"""
466
+ # global _connection_pool
467
+ # if _connection_pool:
468
+ # await _connection_pool.close()
469
+ # logger.info("Connection pool closed")
470
+
471
+ if __name__ == "__main__":
472
+ mcp.run()
servers/biorxiv_server.py ADDED
@@ -0,0 +1,440 @@
1
+ # biorxiv_server.py
2
+ from mcp.server.fastmcp import FastMCP
3
+ import httpx
4
+ import logging
5
+ from datetime import datetime, timedelta
6
+ import sys
7
+ from pathlib import Path
8
+ import re
9
+
10
+ # Add parent directory to path for shared imports
11
+ sys.path.insert(0, str(Path(__file__).parent.parent))
12
+
13
+ from shared import (
14
+ config,
15
+ RateLimiter,
16
+ format_authors,
17
+ ErrorFormatter,
18
+ truncate_text
19
+ )
20
+ from shared.http_client import get_http_client, CustomHTTPClient
21
+
22
+ # Configure logging with DEBUG for detailed troubleshooting
23
+ logging.basicConfig(
24
+ level=logging.DEBUG,
25
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
26
+ )
27
+ logger = logging.getLogger(__name__)
28
+
29
+ mcp = FastMCP("biorxiv-server")
30
+
31
+ # Rate limiting using shared utility
32
+ rate_limiter = RateLimiter(config.rate_limits.biorxiv_delay)
33
+
34
+
35
+ def preprocess_query(query: str) -> tuple[list[str], list[str]]:
36
+ """Preprocess query into search terms and handle synonyms.
37
+
38
+ Returns:
39
+ tuple of (primary_terms, all_search_terms)
40
+ """
41
+ # Convert to lowercase for matching
42
+ query_lower = query.lower()
43
+
44
+ # Common ALS-related synonyms and variations
45
+ synonyms = {
46
+ 'als': ['amyotrophic lateral sclerosis', 'motor neuron disease', 'motor neurone disease', 'lou gehrig'],
47
+ 'amyotrophic lateral sclerosis': ['als', 'motor neuron disease'],
48
+ 'mnd': ['motor neuron disease', 'motor neurone disease', 'als'],
49
+ 'sod1': ['superoxide dismutase 1', 'cu/zn superoxide dismutase'],
50
+ 'tdp-43': ['tdp43', 'tardbp', 'tar dna binding protein'],
51
+ 'c9orf72': ['c9', 'chromosome 9 open reading frame 72'],
52
+ 'fus': ['fused in sarcoma', 'tls'],
53
+ }
54
+
55
+ # Split query into individual terms (handle multiple spaces and special chars)
56
+ # Keep hyphenated words together (like TDP-43)
57
+ terms = re.split(r'\s+', query_lower.strip())
58
+
59
+ # Build comprehensive search term list
60
+ all_terms = []
61
+ primary_terms = []
62
+
63
+ for term in terms:
64
+ # Skip very short terms unless they're known abbreviations
65
+ if len(term) < 3 and term not in ['als', 'mnd', 'fus', 'c9']:
66
+ continue
67
+
68
+ primary_terms.append(term)
69
+ all_terms.append(term)
70
+
71
+ # Add synonyms if they exist
72
+ if term in synonyms:
73
+ all_terms.extend(synonyms[term])
74
+
75
+ # Remove duplicates while preserving order
76
+ seen = set()
77
+ all_terms = [t for t in all_terms if not (t in seen or seen.add(t))]
78
+ seen = set()  # reset: reusing the filled set would filter out every primary term
+ primary_terms = [t for t in primary_terms if not (t in seen or seen.add(t))]
79
+
80
+ return primary_terms, all_terms
81
+
82
+
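To make the synonym expansion concrete, this is the expected output for a typical query, given the synonym table above:

```python
primary, expanded = preprocess_query("ALS TDP-43")
# primary  -> ['als', 'tdp-43']
# expanded -> ['als', 'amyotrophic lateral sclerosis', 'motor neuron disease',
#              'motor neurone disease', 'lou gehrig', 'tdp-43', 'tdp43',
#              'tardbp', 'tar dna binding protein']
```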
83
+ def matches_query(paper: dict, primary_terms: list[str], all_terms: list[str], require_all: bool = False) -> bool:
84
+ """Check if a paper matches the search query.
85
+
86
+ Args:
87
+ paper: Paper dictionary from bioRxiv API
88
+ primary_terms: Main search terms from user query
89
+ all_terms: All search terms including synonyms
90
+ require_all: If True, require ALL primary terms. If False, require ANY term.
91
+
92
+ Returns:
93
+ True if paper matches search criteria
94
+ """
95
+ # Get searchable text
96
+ title = paper.get("title", "").lower()
97
+ abstract = paper.get("abstract", "").lower()
98
+ searchable_text = f" {title} {abstract} " # Add spaces for boundary matching
99
+
100
+ # DEBUG: Log paper being checked
101
+ paper_doi = paper.get("doi", "unknown")
102
+ logger.debug(f"🔍 Checking paper: {title[:60]}... (DOI: {paper_doi})")
103
+
104
+ if not searchable_text.strip():
105
+ logger.debug(f" ❌ Rejected: No title/abstract")
106
+ return False
107
+
108
+ # For ALS specifically, need to be careful about word boundaries
109
+ has_any_match = False
110
+ matched_term = None
111
+ for term in all_terms:
112
+ # For short terms like "ALS", require word boundaries
113
+ if len(term) <= 3:
114
+ # Check for word boundary match
115
+ pattern = r'\b' + re.escape(term) + r'\b'
116
+ if re.search(pattern, searchable_text, re.IGNORECASE):
117
+ has_any_match = True
118
+ matched_term = term
119
+ break
120
+ else:
121
+ # For longer terms, can be more lenient
122
+ if term.lower() in searchable_text:
123
+ has_any_match = True
124
+ matched_term = term
125
+ break
126
+
127
+ if not has_any_match:
128
+ logger.debug(f" ❌ Rejected: No term match. Terms searched: {all_terms[:3]}...")
129
+ return False
130
+
131
+ logger.debug(f" ✅ Matched on term: '{matched_term}'")
132
+
133
+ # If we only need any match, we're done
134
+ if not require_all:
135
+ return True
136
+
137
+ # For require_all, check that all primary terms are present
138
+ # Allow for word boundaries to avoid partial matches
139
+ for term in primary_terms:
140
+ # Create pattern that matches the term as a whole word or part of hyphenated word
141
+ # This handles cases like "TDP-43" or "SOD1"
142
+ pattern = r'\b' + re.escape(term) + r'(?:\b|[-])'
143
+ if not re.search(pattern, searchable_text, re.IGNORECASE):
144
+ return False
145
+
146
+ return True
147
+
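The word-boundary rule for short terms is what keeps "ALS" from matching inside unrelated words; a quick illustration:

```python
import re

print(bool(re.search(r"\bals\b", " signals from recent trials ", re.IGNORECASE)))  # False
print(bool(re.search(r"\bals\b", " an ALS cohort ", re.IGNORECASE)))               # True
```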
148
+
149
+ @mcp.tool()
150
+ async def search_preprints(
151
+ query: str,
152
+ server: str = "both",
153
+ max_results: int = 10,
154
+ days_back: int = 365
155
+ ) -> str:
156
+ """Search bioRxiv and medRxiv for ALS preprints. Returns recent preprints before peer review.
157
+
158
+ Args:
159
+ query: Search query (e.g., 'ALS TDP-43')
160
+ server: Which server to search - one of: biorxiv, medrxiv, both (default: both)
161
+ max_results: Maximum number of results (default: 10)
162
+ days_back: Number of days to look back (default: 365 - about 1 year)
163
+ """
164
+ try:
165
+ logger.info(f"🔎 Searching bioRxiv/medRxiv for: '{query}'")
166
+ logger.info(f" Parameters: server={server}, max_results={max_results}, days_back={days_back}")
167
+
168
+ # Preprocess query for better matching
169
+ primary_terms, all_terms = preprocess_query(query)
170
+ logger.info(f"📝 Search terms: primary={primary_terms}, all={all_terms}")
171
+
172
+ # Calculate date range
173
+ end_date = datetime.now()
174
+ start_date = end_date - timedelta(days=days_back)
175
+
176
+ # Format dates for API (YYYY-MM-DD)
177
+ start_date_str = start_date.strftime("%Y-%m-%d")
178
+ end_date_str = end_date.strftime("%Y-%m-%d")
179
+ logger.info(f"📅 Date range: {start_date_str} to {end_date_str}")
180
+
181
+ # bioRxiv/medRxiv API endpoint
182
+ base_url = "https://api.biorxiv.org/details"
183
+
184
+ all_results = []
185
+ servers_to_search = []
186
+
187
+ if server in ["biorxiv", "both"]:
188
+ servers_to_search.append("biorxiv")
189
+ if server in ["medrxiv", "both"]:
190
+ servers_to_search.append("medrxiv")
191
+
192
+ # Use a custom HTTP client with proper timeout for bioRxiv
193
+ # Don't use shared client as it may have conflicting timeout settings
194
+ async with CustomHTTPClient(timeout=15.0) as client:
195
+ for srv in servers_to_search:
196
+ try:
197
+ cursor = 0
198
+ found_in_server = []
199
+ max_iterations = 1 # Only check first page (100 papers) for much faster response
200
+ iteration = 0
201
+
202
+ while iteration < max_iterations:
203
+ # Rate limiting
204
+ await rate_limiter.wait()
205
+
206
+ # Search by date range with cursor for pagination
207
+ url = f"{base_url}/{srv}/{start_date_str}/{end_date_str}/{cursor}"
208
+
209
+ logger.info(f"🌐 Querying {srv} API (page {iteration+1}, cursor={cursor})")
210
+ logger.info(f" URL: {url}")
211
+ response = await client.get(url)
212
+ response.raise_for_status()
213
+ data = response.json()
214
+
215
+ # Extract collection
216
+ collection = data.get("collection", [])
217
+
218
+ if not collection:
219
+ logger.info(f"📭 No more results from {srv}")
220
+ break
221
+
222
+ logger.info(f"📦 Fetched {len(collection)} papers from API")
223
+
224
+ # Show first few papers for debugging
225
+ if iteration == 0 and collection:
226
+ logger.info(" Sample papers from API:")
227
+ for i, paper in enumerate(collection[:3]):
228
+ logger.info(f" {i+1}. {paper.get('title', 'No title')[:60]}...")
229
+
230
+ # Filter papers using improved matching
231
+ # Start with lenient matching (ANY term)
232
+ logger.debug(f"🔍 Starting to filter {len(collection)} papers...")
233
+ filtered = [
234
+ paper for paper in collection
235
+ if matches_query(paper, primary_terms, all_terms, require_all=False)
236
+ ]
237
+
238
+ logger.info(f"✅ Filtered results: {len(filtered)}/{len(collection)} papers matched")
239
+
240
+ if len(filtered) > 0:
241
+ logger.info(" Matched papers:")
242
+ for i, paper in enumerate(filtered[:3]):
243
+ logger.info(f" {i+1}. {paper.get('title', 'No title')[:60]}...")
244
+
245
+ found_in_server.extend(filtered)
246
+ logger.info(f"📊 Running total for {srv}: {len(found_in_server)} papers")
247
+
248
+ # Check if we have enough results
249
+ if len(found_in_server) >= max_results:
250
+ logger.info(f"Reached max_results limit ({max_results})")
251
+ break
252
+
253
+ # Continue searching if we haven't found enough (no-op while max_iterations == 1; kept for when deeper pagination is re-enabled)
254
+ if len(found_in_server) < 5 and iteration < max_iterations - 1:
255
+ # Keep searching for more results
256
+ pass
257
+ elif len(found_in_server) > 0 and iteration >= 3:
258
+ # Found some results after reasonable search
259
+ logger.info(f"Found {len(found_in_server)} results after {iteration+1} pages")
260
+ break
261
+
262
+ # Check for more pages
263
+ messages = data.get("messages", [])
264
+
265
+ # The API returns "cursor" in messages for next page
266
+ has_more = False
267
+ for msg in messages:
268
+ if "cursor=" in str(msg):
269
+ try:
270
+ cursor_str = str(msg).split("cursor=")[1].split()[0]
271
+ next_cursor = int(cursor_str)
272
+ if next_cursor > cursor:
273
+ cursor = next_cursor
274
+ has_more = True
275
+ break
276
+ except (ValueError, IndexError):
277
+ pass
278
+
279
+ # Alternative: increment by collection size
280
+ if not has_more:
281
+ if len(collection) >= 100:
282
+ cursor += len(collection)
283
+ else:
284
+ # Less than full page means we've reached the end
285
+ break
286
+
287
+ iteration += 1
288
+
289
+ all_results.extend(found_in_server[:max_results])
290
+ logger.info(f"🏁 Total results from {srv}: {len(found_in_server)} papers found")
291
+
292
+ except httpx.HTTPStatusError as e:
293
+ logger.warning(f"Error searching {srv}: {e}")
294
+ continue
295
+ except Exception as e:
296
+ logger.warning(f"Unexpected error searching {srv}: {e}")
297
+ continue
298
+
299
+ # If no results with lenient matching, provide helpful message
300
+ if not all_results:
301
+ logger.warning(f"⚠️ No preprints found for query: {query}")
302
+
303
+ # Provide suggestions for improving search
304
+ suggestions = []
305
+ if len(primary_terms) > 3:
306
+ suggestions.append("Try using fewer search terms")
307
+ if not any(term in ['als', 'amyotrophic lateral sclerosis', 'motor neuron'] for term in all_terms):
308
+ suggestions.append("Add 'ALS' or 'motor neuron disease' to your search")
309
+ if days_back < 365:
310
+ suggestions.append(f"Expand the time range beyond {days_back} days")
311
+
312
+ suggestion_text = ""
313
+ if suggestions:
314
+ suggestion_text = "\n\nSuggestions:\n" + "\n".join(f"- {s}" for s in suggestions)
315
+
316
+ return f"No preprints found for query: '{query}' in the last {days_back} days{suggestion_text}"
317
+
318
+ # Sort by date (most recent first)
319
+ all_results.sort(key=lambda x: x.get("date", ""), reverse=True)
320
+
321
+ # Limit results
322
+ all_results = all_results[:max_results]
323
+
324
+ logger.info(f"🎯 FINAL RESULTS: Returning {len(all_results)} preprints for '{query}'")
325
+ if all_results:
326
+ logger.info(" Top results:")
327
+ for i, paper in enumerate(all_results[:3], 1):
328
+ logger.info(f" {i}. {paper.get('title', 'No title')[:60]}...")
329
+ logger.info(f" DOI: {paper.get('doi', 'unknown')}, Date: {paper.get('date', 'unknown')}")
330
+
331
+ # Format results
332
+ result = f"Found {len(all_results)} preprints for query: '{query}'\n\n"
333
+
334
+ for i, paper in enumerate(all_results, 1):
335
+ title = paper.get("title", "No title")
336
+ doi = paper.get("doi", "Unknown")
337
+ date = paper.get("date", "Unknown")
338
+ authors = paper.get("authors", "Unknown authors")
339
+ authors_str = format_authors(authors, max_authors=3)
340
+
341
+ abstract = paper.get("abstract", "No abstract available")
342
+ category = paper.get("category", "")
343
+ server_name = paper.get("server", "bioRxiv")  # both servers share the 10.1101 DOI prefix, so the DOI alone cannot distinguish them
344
+
345
+ result += f"{i}. **{title}**\n"
346
+ result += f" DOI: {doi} | {server_name} | Posted: {date}\n"
347
+ result += f" Authors: {authors_str}\n"
348
+ if category:
349
+ result += f" Category: {category}\n"
350
+ result += f" Abstract: {truncate_text(abstract, max_chars=300, suffix='')}\n"
351
+ result += f" URL: https://doi.org/{doi}\n\n"
352
+
353
+ logger.info(f"Successfully retrieved {len(all_results)} preprints")
354
+ return result
355
+
356
+ except httpx.TimeoutException:
357
+ logger.error("bioRxiv/medRxiv API request timed out")
358
+ return "Error: bioRxiv/medRxiv API request timed out. Please try again."
359
+ except httpx.HTTPStatusError as e:
360
+ logger.error(f"bioRxiv/medRxiv API error: {e}")
361
+ return f"Error: bioRxiv/medRxiv API returned status code {e.response.status_code}"
362
+ except Exception as e:
363
+ logger.error(f"Unexpected error in search_preprints: {e}")
364
+ return f"Error searching preprints: {str(e)}"
365
+
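For reference, the details endpoint returns JSON shaped roughly like the sketch below; the field names match what the code above consumes, but treat the exact schema as an assumption rather than a spec:

```python
sample_response = {
    "messages": [{"status": "ok", "count": 100, "cursor": "100"}],
    "collection": [
        {
            "doi": "10.1101/2024.01.01.123456",
            "title": "TDP-43 aggregation in ALS models",
            "authors": "Doe, J.; Smith, A.",
            "date": "2024-01-05",
            "category": "neuroscience",
            "abstract": "...",
            "server": "biorxiv",
        }
    ],
}
```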
366
+
367
+ @mcp.tool()
368
+ async def get_preprint_details(doi: str) -> str:
369
+ """Get full details for a specific bioRxiv/medRxiv preprint by DOI.
370
+
371
+ Args:
372
+ doi: The DOI of the preprint (e.g., '10.1101/2024.01.01.123456')
373
+ """
374
+ try:
375
+ logger.info(f"Getting details for DOI: {doi}")
376
+
377
+ # Ensure DOI is properly formatted
378
+ if not doi.startswith("10.1101/"):
379
+ doi = f"10.1101/{doi}"
380
+
381
+ # Determine server from DOI
382
+ # bioRxiv DOIs typically have format: 10.1101/YYYY.MM.DD.NNNNNN
383
+ # medRxiv DOIs are similar but the content determines the server
384
+
385
+ # Use shared HTTP client for connection pooling
386
+ client = get_http_client(timeout=30.0)
387
+ # Try the DOI endpoint
388
+ url = f"https://api.biorxiv.org/details/{doi}"
389
+
390
+ response = await client.get(url)
391
+
392
+ if response.status_code == 404:
393
+ # Try with both servers
394
+ for server in ["biorxiv", "medrxiv"]:
395
+ url = f"https://api.biorxiv.org/details/{server}/{doi}"
396
+ response = await client.get(url)
397
+ if response.status_code == 200:
398
+ break
399
+ else:
400
+ return f"Preprint with DOI {doi} not found"
401
+
402
+ response.raise_for_status()
403
+ data = response.json()
404
+
405
+ collection = data.get("collection", [])
406
+ if not collection:
407
+ return f"No details found for DOI: {doi}"
408
+
409
+ # Get the first (and should be only) paper
410
+ paper = collection[0]
411
+
412
+ title = paper.get("title", "No title")
413
+ date = paper.get("date", "Unknown")
414
+ authors = paper.get("authors", "Unknown authors")
415
+ abstract = paper.get("abstract", "No abstract available")
416
+ category = paper.get("category", "")
417
+ server_name = paper.get("server", "Unknown")
418
+
419
+ result = f"**{title}**\n\n"
420
+ result += f"**DOI:** {doi}\n"
421
+ result += f"**Server:** {server_name}\n"
422
+ result += f"**Posted:** {date}\n"
423
+ if category:
424
+ result += f"**Category:** {category}\n"
425
+ result += f"**Authors:** {authors}\n\n"
426
+ result += f"**Abstract:**\n{abstract}\n\n"
427
+ result += f"**Full Text URL:** https://doi.org/{doi}\n"
428
+
429
+ return result
430
+
431
+ except httpx.HTTPStatusError as e:
432
+ logger.error(f"Error fetching preprint details: {e}")
433
+ return f"Error fetching preprint details: HTTP {e.response.status_code}"
434
+ except Exception as e:
435
+ logger.error(f"Unexpected error getting preprint details: {e}")
436
+ return f"Error getting preprint details: {str(e)}"
437
+
438
+
439
+ if __name__ == "__main__":
440
+ mcp.run(transport="stdio")
servers/clinicaltrials_links.py ADDED
@@ -0,0 +1,245 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Simplified ClinicalTrials.gov Link Generator
4
+ Provides direct links and known trials as fallback when AACT is unavailable
5
+ """
6
+
7
+ from mcp.server.fastmcp import FastMCP
8
+ import logging
9
+ from typing import Optional
10
+ from urllib.parse import quote_plus
11
+
12
+ # Configure logging
13
+ logging.basicConfig(level=logging.INFO)
14
+ logger = logging.getLogger(__name__)
15
+
16
+ mcp = FastMCP("clinicaltrials-links")
17
+
18
+ # Known important ALS trials (updated periodically)
19
+ KNOWN_ALS_TRIALS = {
20
+ "NCT05112094": {
21
+ "title": "Tofersen (ATLAS)",
22
+ "description": "SOD1-targeted antisense therapy for SOD1-ALS",
23
+ "status": "Active",
24
+ "sponsor": "Biogen"
25
+ },
26
+ "NCT04856982": {
27
+ "title": "HEALEY ALS Platform Trial",
28
+ "description": "Multiple drugs tested simultaneously",
29
+ "status": "Recruiting",
30
+ "sponsor": "Massachusetts General Hospital"
31
+ },
32
+ "NCT04768972": {
33
+ "title": "Ravulizumab",
34
+ "description": "Complement C5 inhibition",
35
+ "status": "Active",
36
+ "sponsor": "Alexion"
37
+ },
38
+ "NCT05370950": {
39
+ "title": "Pridopidine",
40
+ "description": "Sigma-1 receptor agonist",
41
+ "status": "Recruiting",
42
+ "sponsor": "Prilenia"
43
+ },
44
+ "NCT04632225": {
45
+ "title": "NurOwn",
46
+ "description": "MSC-NTF cells (mesenchymal stem cells)",
47
+ "status": "Active",
48
+ "sponsor": "BrainStorm Cell"
49
+ },
50
+ "NCT07204977": {
51
+ "title": "Acamprosate",
52
+ "description": "C9orf72 hexanucleotide repeat expansion treatment",
53
+ "status": "Recruiting",
54
+ "sponsor": "Mayo Clinic"
55
+ },
56
+ "NCT07161999": {
57
+ "title": "COYA 302",
58
+ "description": "Regulatory T-cell therapy",
59
+ "status": "Recruiting",
60
+ "sponsor": "Coya Therapeutics"
61
+ },
62
+ "NCT07023835": {
63
+ "title": "Usnoflast",
64
+ "description": "Anti-inflammatory for ALS",
65
+ "status": "Recruiting",
66
+ "sponsor": "Seelos Therapeutics"
67
+ }
68
+ }
69
+
70
+
71
+ @mcp.tool()
72
+ async def get_trial_link(nct_id: str) -> str:
73
+ """Generate direct link to a ClinicalTrials.gov trial page.
74
+
75
+ Args:
76
+ nct_id: NCT identifier (e.g., 'NCT05112094')
77
+ """
78
+ nct_id = nct_id.upper()
79
+ url = f"https://clinicaltrials.gov/study/{nct_id}"
80
+
81
+ result = f"**Direct link to trial {nct_id}:**\n{url}\n\n"
82
+
83
+ # Add info if it's a known trial
84
+ if nct_id in KNOWN_ALS_TRIALS:
85
+ trial = KNOWN_ALS_TRIALS[nct_id]
86
+ result += f"**{trial['title']}**\n"
87
+ result += f"Description: {trial['description']}\n"
88
+ result += f"Status: {trial['status']}\n"
89
+ result += f"Sponsor: {trial['sponsor']}\n"
90
+
91
+ return result
92
+
93
+
94
+ @mcp.tool()
95
+ async def get_search_link(
96
+ condition: str = "ALS",
97
+ status: Optional[str] = None,
98
+ intervention: Optional[str] = None,
99
+ location: Optional[str] = None
100
+ ) -> str:
101
+ """Generate direct search link for ClinicalTrials.gov.
102
+
103
+ Args:
104
+ condition: Medical condition (default: ALS)
105
+ status: Trial status (recruiting, active, completed)
106
+ intervention: Treatment/drug name
107
+ location: Country or city
108
+ """
109
+ base_url = "https://clinicaltrials.gov/search"
110
+ params = []
111
+
112
+ # Add condition
113
+ params.append(f"cond={quote_plus(condition)}")
114
+
115
+ # Map status to ClinicalTrials.gov format
116
+ if status:
117
+ status_lower = status.lower()
118
+ if "recruit" in status_lower:
119
+ params.append("recrs=a") # Recruiting
120
+ elif "active" in status_lower:
121
+ params.append("recrs=d") # Active, not recruiting
122
+ elif "complet" in status_lower:
123
+ params.append("recrs=e") # Completed
124
+
125
+ # Add intervention
126
+ if intervention:
127
+ params.append(f"intr={quote_plus(intervention)}")
128
+
129
+ # Add location
130
+ if location:
131
+ params.append(f"locn={quote_plus(location)}")
132
+
133
+ # Build URL
134
+ search_url = f"{base_url}?{'&'.join(params)}"
135
+
136
+ result = f"**Direct search on ClinicalTrials.gov:**\n\n"
137
+ result += f"Search parameters:\n"
138
+ result += f"- Condition: {condition}\n"
139
+ if status:
140
+ result += f"- Status: {status}\n"
141
+ if intervention:
142
+ result += f"- Intervention: {intervention}\n"
143
+ if location:
144
+ result += f"- Location: {location}\n"
145
+ result += f"\n🔗 **Search URL:** {search_url}\n"
146
+ result += f"\nClick the link above to see results on ClinicalTrials.gov"
147
+
148
+ return result
149
+
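As a concrete example, a recruiting tofersen search produces a URL like the one below (assuming ClinicalTrials.gov keeps supporting these legacy query parameters):

```python
from urllib.parse import quote_plus

params = [f"cond={quote_plus('ALS')}", "recrs=a", f"intr={quote_plus('tofersen')}"]
print(f"https://clinicaltrials.gov/search?{'&'.join(params)}")
# https://clinicaltrials.gov/search?cond=ALS&recrs=a&intr=tofersen
```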
150
+
151
+ @mcp.tool()
152
+ async def get_known_als_trials(
153
+ status_filter: Optional[str] = None
154
+ ) -> str:
155
+ """Get list of known important ALS trials.
156
+
157
+ Args:
158
+ status_filter: Filter by status (recruiting, active, all)
159
+ """
160
+ result = "**Important ALS Clinical Trials:**\n\n"
161
+
162
+ if not KNOWN_ALS_TRIALS:
163
+ return "No known trials available in offline database."
164
+
165
+ count = 0
166
+ for nct_id, trial in KNOWN_ALS_TRIALS.items():
167
+ # Apply status filter if provided
168
+ if status_filter:
169
+ filter_lower = status_filter.lower()
170
+ trial_status = trial['status'].lower()
171
+
172
+ if filter_lower == "recruiting" and "recruit" not in trial_status:
173
+ continue
174
+ elif filter_lower == "active" and "active" not in trial_status:
175
+ continue
176
+ elif filter_lower == "completed" and "complet" not in trial_status:
177
+ continue
178
+
179
+ count += 1
180
+ result += f"{count}. **{trial['title']}** ({nct_id})\n"
181
+ result += f" {trial['description']}\n"
182
+ result += f" Status: {trial['status']} | Sponsor: {trial['sponsor']}\n"
183
+ result += f" 🔗 https://clinicaltrials.gov/study/{nct_id}\n\n"
184
+
185
+ if count == 0:
186
+ result += f"No trials found with status filter: {status_filter}\n"
187
+ else:
188
+ result += f"\n📌 *This is a curated list. For comprehensive search, use AACT database server.*"
189
+
190
+ return result
191
+
192
+
193
+ @mcp.tool()
194
+ async def get_trial_resources() -> str:
195
+ """Get helpful resources for finding clinical trials."""
196
+
197
+ resources = """**Clinical Trials Resources for ALS:**
198
+
199
+ **Official Databases:**
200
+ 1. **ClinicalTrials.gov**: https://clinicaltrials.gov/search?cond=ALS
201
+ - Official US trials registry
202
+ - Most comprehensive for US trials
203
+
204
+ 2. **WHO ICTRP**: https://trialsearch.who.int/
205
+ - International trials from all countries
206
+ - Includes non-US trials
207
+
208
+ 3. **EU Clinical Trials Register**: https://www.clinicaltrialsregister.eu/
209
+ - European trials database
210
+
211
+ **ALS-Specific Resources:**
212
+ 1. **Northeast ALS Consortium (NEALS)**: https://www.neals.org/
213
+ - Network of ALS clinical trial sites
214
+ - Trial matching service
215
+
216
+ 2. **ALS Therapy Development Institute**: https://www.als.net/clinical-trials/
217
+ - Independent ALS research organization
218
+ - Trial tracker and updates
219
+
220
+ 3. **I AM ALS Registry**: https://iamals.org/get-help/clinical-trials/
221
+ - Patient-focused trial information
222
+ - Trial matching assistance
223
+
224
+ **Major ALS Clinical Centers:**
225
+ - Massachusetts General Hospital (Healey Center)
226
+ - Johns Hopkins ALS Clinic
227
+ - Mayo Clinic ALS Center
228
+ - Cleveland Clinic Lou Ruvo Center
229
+ - UCSF ALS Center
230
+
231
+ **Tips for Finding Trials:**
232
+ 1. Use condition terms: "ALS", "Amyotrophic Lateral Sclerosis", "Motor Neuron Disease"
233
+ 2. Check recruiting AND not-yet-recruiting trials
234
+ 3. Consider trials at different phases (1, 2, 3)
235
+ 4. Look for platform trials testing multiple drugs
236
+ 5. Contact trial coordinators directly for eligibility
237
+
238
+ **Note:** For programmatic access to trial data, use the AACT database server which provides complete ClinicalTrials.gov data without API restrictions.
239
+ """
240
+
241
+ return resources
242
+
243
+
244
+ if __name__ == "__main__":
245
+ mcp.run(transport="stdio")
servers/elevenlabs_server.py ADDED
@@ -0,0 +1,561 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ ElevenLabs MCP Server for Voice Capabilities
4
+ Provides text-to-speech and speech-to-text for ALS Research Agent
5
+
6
+ This server enables voice accessibility features crucial for ALS patients
7
+ who may have limited mobility but retain cognitive function.
8
+ """
9
+
10
+ from mcp.server.fastmcp import FastMCP
11
+ import httpx
12
+ import logging
13
+ import os
14
+ import base64
15
+ import json
16
+ from typing import Optional, Dict, Any
17
+ from pathlib import Path
18
+ import sys
19
+
20
+ # Add parent directory to path for shared imports
21
+ sys.path.insert(0, str(Path(__file__).parent.parent))
22
+
23
+ from shared import config
24
+ from shared.http_client import get_http_client
25
+
26
+ # Configure logging
27
+ logging.basicConfig(level=logging.INFO)
28
+ logger = logging.getLogger(__name__)
29
+
30
+ # Initialize MCP server
31
+ mcp = FastMCP("elevenlabs-voice")
32
+
33
+ # ElevenLabs API configuration
34
+ ELEVENLABS_API_KEY = os.getenv("ELEVENLABS_API_KEY")
35
+ ELEVENLABS_API_BASE = "https://api.elevenlabs.io/v1"
36
+
37
+ # Default voice settings optimized for clarity (important for ALS patients)
38
+ DEFAULT_VOICE_ID = os.getenv("ELEVENLABS_VOICE_ID", "21m00Tcm4TlvDq8ikWAM") # Rachel voice (clear and calm)
39
+ DEFAULT_MODEL = "eleven_turbo_v2_5" # Turbo v2.5 - Fastest model available (40% faster than v2)
40
+
41
+ # Voice settings for accessibility
42
+ VOICE_SETTINGS = {
43
+ "stability": 0.5, # Balanced for speed and clarity (turbo model)
44
+ "similarity_boost": 0.5, # Balanced setting for faster processing
45
+ "style": 0.0, # Neutral style for clarity
46
+ "use_speaker_boost": True # Enhanced clarity
47
+ }
48
+
49
+
50
+ @mcp.tool()
51
+ async def text_to_speech(
52
+ text: str,
53
+ voice_id: Optional[str] = None,
54
+ output_format: str = "mp3_44100_128",
55
+ speed: float = 1.0
56
+ ) -> str:
57
+ """Convert text to speech optimized for ALS patients.
58
+
59
+ Args:
60
+ text: Text to convert to speech (research findings, paper summaries, etc.)
61
+ voice_id: ElevenLabs voice ID (defaults to clear, calm voice)
62
+ output_format: Audio format (mp3_44100_128, mp3_44100_192, pcm_16000, etc.)
63
+ speed: Speech rate (0.5-2.0, default 1.0 - can be slower for clarity)
64
+
65
+ Returns:
66
+ Base64 encoded audio data and metadata
67
+ """
68
+ try:
69
+ if not ELEVENLABS_API_KEY:
70
+ return json.dumps({
71
+ "status": "error",
72
+ "error": "ELEVENLABS_API_KEY not configured",
73
+ "message": "Please set your ElevenLabs API key in .env file"
74
+ }, indent=2)
75
+
76
+ # Limit text length to avoid ElevenLabs API timeouts
77
+ # Testing shows 2500 chars is safe, 5000 chars times out
78
+ max_length = 2500
79
+ if len(text) > max_length:
80
+ logger.warning(f"Text truncated from {len(text)} to {max_length} characters to avoid timeout")
81
+ # Try to truncate at a sentence boundary
82
+ truncated = text[:max_length]
83
+ last_period = truncated.rfind('.')
84
+ last_newline = truncated.rfind('\n')
85
+ # Use the latest sentence/paragraph boundary
86
+ boundary = max(last_period, last_newline)
87
+ if boundary > max_length - 500: # If there's a boundary in the last 500 chars
88
+ text = truncated[:boundary + 1]
89
+ else:
90
+ text = truncated + "..."
91
+
92
+ voice_id = voice_id or DEFAULT_VOICE_ID
93
+
94
+ # Prepare the request
95
+ url = f"{ELEVENLABS_API_BASE}/text-to-speech/{voice_id}?output_format={output_format}"  # forward the requested format; otherwise the parameter is silently ignored
96
+
97
+ headers = {
98
+ "xi-api-key": ELEVENLABS_API_KEY,
99
+ "Content-Type": "application/json"
100
+ }
101
+
102
+ # Adjust voice settings for speed
103
+ adjusted_settings = VOICE_SETTINGS.copy()
104
+ if speed < 1.0:
105
+ # Slower speech - increase stability for clarity
106
+ adjusted_settings["stability"] = min(1.0, adjusted_settings["stability"] + 0.1)
107
+
108
+ payload = {
109
+ "text": text,
110
+ "model_id": DEFAULT_MODEL,
111
+ "voice_settings": adjusted_settings
112
+ }
113
+
114
+ logger.info(f"Converting text to speech: {len(text)} characters")
115
+
116
+ # Set timeout based on text length (with 2500 char limit, 45s should be enough)
117
+ timeout = 45.0
118
+ logger.info(f"Using timeout of {timeout} seconds")
119
+
120
+ # Use shared HTTP client for connection pooling
121
+ client = get_http_client(timeout=timeout)
122
+ response = await client.post(url, json=payload, headers=headers)
123
+ response.raise_for_status()
124
+
125
+ # Get the audio data
126
+ audio_data = response.content
127
+
128
+ # Encode to base64 for transmission
129
+ audio_base64 = base64.b64encode(audio_data).decode('utf-8')
130
+
131
+ # Return structured response
132
+ result = {
133
+ "status": "success",
134
+ "audio_base64": audio_base64,
135
+ "format": output_format,
136
+ "duration_estimate": len(text) / 150 * 60, # Rough estimate: 150 words/min
137
+ "text_length": len(text),
138
+ "voice_id": voice_id,
139
+ "message": "Audio generated successfully. Use the audio_base64 field to play the audio."
140
+ }
141
+
142
+ logger.info(f"Successfully generated {len(audio_data)} bytes of audio")
143
+ return json.dumps(result, indent=2)
144
+
145
+ except httpx.HTTPStatusError as e:
146
+ logger.error(f"ElevenLabs API error: {e}")
147
+ if e.response.status_code == 401:
148
+ return json.dumps({
149
+ "status": "error",
150
+ "error": "Authentication failed",
151
+ "message": "Check your ELEVENLABS_API_KEY"
152
+ }, indent=2)
153
+ elif e.response.status_code == 429:
154
+ return json.dumps({
155
+ "status": "error",
156
+ "error": "Rate limit exceeded",
157
+ "message": "Please wait before trying again"
158
+ }, indent=2)
159
+ else:
160
+ return json.dumps({
161
+ "status": "error",
162
+ "error": f"API error: {e.response.status_code}",
163
+ "message": str(e)
164
+ }, indent=2)
165
+
166
+ except Exception as e:
167
+ logger.error(f"Unexpected error in text_to_speech: {e}")
168
+ return json.dumps({
169
+ "status": "error",
170
+ "error": "Text-to-speech error",
171
+ "message": str(e)
172
+ }, indent=2)
173
+
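On the consuming side, the base64 payload decodes straight to a playable file; a minimal sketch assuming the success schema above:

```python
import base64
import json

def save_tts_audio(tts_json: str, path: str = "summary.mp3") -> bool:
    # Decode the tool's JSON response and write the audio bytes to disk.
    data = json.loads(tts_json)
    if data.get("status") != "success":
        return False
    with open(path, "wb") as f:
        f.write(base64.b64decode(data["audio_base64"]))
    return True
```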
174
+
175
+ @mcp.tool()
176
+ async def create_audio_summary(
177
+ content: str,
178
+ summary_type: str = "research",
179
+ max_duration: int = 60
180
+ ) -> str:
181
+ """Create an audio summary of research content optimized for listening.
182
+
183
+ This tool reformats technical content into a more listenable format
184
+ before converting to speech - important for complex medical research.
185
+
186
+ Args:
187
+ content: Research content to summarize (paper abstract, findings, etc.)
188
+ summary_type: Type of summary - "research", "clinical", "patient-friendly"
189
+ max_duration: Target duration in seconds (affects summary length)
190
+
191
+ Returns:
192
+ Audio summary with both text and audio versions
193
+ """
194
+ try:
195
+ # Calculate target word count (assuming 150 words per minute)
196
+ target_words = int((max_duration / 60) * 150)
197
+
198
+ # Process content based on summary type
199
+ if summary_type == "patient-friendly":
200
+ # Simplify medical jargon for patients/families
201
+ processed_text = _simplify_medical_content(content, target_words)
202
+ elif summary_type == "clinical":
203
+ # Focus on clinical relevance
204
+ processed_text = _extract_clinical_relevance(content, target_words)
205
+ else: # research
206
+ # Standard research summary
207
+ processed_text = _create_research_summary(content, target_words)
208
+
209
+ # Add intro for context
210
+ intro = "Here's your audio research summary: "
211
+ final_text = intro + processed_text
212
+
213
+ # Convert to speech
214
+ tts_result = await text_to_speech(
215
+ text=final_text,
216
+ speed=0.95 # Slightly slower for complex content
217
+ )
218
+
219
+ # Parse the TTS result
220
+ tts_data = json.loads(tts_result)
221
+
222
+ if tts_data.get("status") != "success":
223
+ return tts_result # Return error from TTS
224
+
225
+ # Return enhanced result
226
+ result = {
227
+ "status": "success",
228
+ "audio_base64": tts_data["audio_base64"],
229
+ "text_summary": processed_text,
230
+ "summary_type": summary_type,
231
+ "word_count": len(processed_text.split()),
232
+ "estimated_duration": tts_data["duration_estimate"],
233
+ "format": tts_data["format"],
234
+ "message": f"Audio summary created: {summary_type} format, ~{int(tts_data['duration_estimate'])} seconds"
235
+ }
236
+
237
+ return json.dumps(result, indent=2)
238
+
239
+ except Exception as e:
240
+ logger.error(f"Error creating audio summary: {e}")
241
+ return json.dumps({
242
+ "status": "error",
243
+ "error": "Summary creation error",
244
+ "message": str(e)
245
+ }, indent=2)
246
+
247
+
248
+ @mcp.tool()
249
+ async def list_voices() -> str:
250
+ """List available voices optimized for medical/research content.
251
+
252
+ Returns voices suitable for clear pronunciation of medical terminology.
253
+ """
254
+ try:
255
+ if not ELEVENLABS_API_KEY:
256
+ return json.dumps({
257
+ "status": "error",
258
+ "error": "ELEVENLABS_API_KEY not configured",
259
+ "message": "Please set your ElevenLabs API key in .env file"
260
+ }, indent=2)
261
+
262
+ url = f"{ELEVENLABS_API_BASE}/voices"
263
+ headers = {"xi-api-key": ELEVENLABS_API_KEY}
264
+
265
+ # Use shared HTTP client for connection pooling
266
+ client = get_http_client(timeout=10.0)
267
+ response = await client.get(url, headers=headers)
268
+ response.raise_for_status()
269
+
270
+ data = response.json()
271
+ voices = data.get("voices", [])
272
+
273
+ # Filter and rank voices for medical content
274
+ recommended_voices = []
275
+ for voice in voices:
276
+ # Prefer clear, professional voices
277
+ labels = voice.get("labels", {})
278
+ if any(label in ["clear", "professional", "narration"] for label in labels.values()):
279
+ recommended_voices.append({
280
+ "voice_id": voice["voice_id"],
281
+ "name": voice["name"],
282
+ "preview_url": voice.get("preview_url"),
283
+ "description": voice.get("description", ""),
284
+ "recommended_for": "medical_content"
285
+ })
286
+
287
+ # Add all other voices
288
+ other_voices = []
289
+ for voice in voices:
290
+ if voice["voice_id"] not in [v["voice_id"] for v in recommended_voices]:
291
+ other_voices.append({
292
+ "voice_id": voice["voice_id"],
293
+ "name": voice["name"],
294
+ "preview_url": voice.get("preview_url"),
295
+ "description": voice.get("description", "")
296
+ })
297
+
298
+ result = {
299
+ "status": "success",
300
+ "recommended_voices": recommended_voices[:5], # Top 5 recommended
301
+ "other_voices": other_voices[:10], # Limit for clarity
302
+ "total_voices": len(voices),
303
+ "message": "Recommended voices are optimized for clear medical terminology pronunciation"
304
+ }
305
+
306
+ return json.dumps(result, indent=2)
307
+
308
+ except Exception as e:
309
+ logger.error(f"Error listing voices: {e}")
310
+ return json.dumps({
311
+ "status": "error",
312
+ "error": "Failed to list voices",
313
+ "message": str(e)
314
+ }, indent=2)
315
+
316
+
317
+ @mcp.tool()
318
+ async def pronunciation_guide(
319
+ medical_terms: list[str],
320
+ include_audio: bool = True
321
+ ) -> str:
322
+ """Generate pronunciation guide for medical terms.
323
+
324
+ Critical for ALS patients/caregivers learning about complex terminology.
325
+
326
+ Args:
327
+ medical_terms: List of medical terms to pronounce
328
+ include_audio: Whether to include audio pronunciation
329
+
330
+ Returns:
331
+ Pronunciation guide with optional audio
332
+ """
333
+ try:
334
+ results = []
335
+
336
+ for term in medical_terms[:10]: # Limit to prevent long processing
337
+ # Create phonetic breakdown
338
+ phonetic = _get_phonetic_spelling(term)
339
+
340
+ # Create pronunciation text
341
+ pronunciation_text = f"{term}. {phonetic}. {term}."
342
+
343
+ result_entry = {
344
+ "term": term,
345
+ "phonetic": phonetic
346
+ }
347
+
348
+ if include_audio:
349
+ # Generate audio
350
+ tts_result = await text_to_speech(
351
+ text=pronunciation_text,
352
+ speed=0.8 # Slower for clarity
353
+ )
354
+
355
+ tts_data = json.loads(tts_result)
356
+ if tts_data.get("status") == "success":
357
+ result_entry["audio_base64"] = tts_data["audio_base64"]
358
+
359
+ results.append(result_entry)
360
+
361
+ return json.dumps({
362
+ "status": "success",
363
+ "pronunciations": results,
364
+ "message": f"Generated pronunciation guide for {len(results)} terms"
365
+ }, indent=2)
366
+
367
+ except Exception as e:
368
+ logger.error(f"Error creating pronunciation guide: {e}")
369
+ return json.dumps({
370
+ "status": "error",
371
+ "error": "Pronunciation guide error",
372
+ "message": str(e)
373
+ }, indent=2)
374
+
375
+
376
+ # Helper functions for content processing
377
+
378
+ def _simplify_medical_content(content: str, target_words: int) -> str:
379
+ """Simplify medical content for patient understanding."""
380
+ # This would ideally use NLP, but for now, basic simplification
381
+
382
+ # First, strip references for cleaner audio
383
+ content = _strip_references(content)
384
+
385
+ # Common medical term replacements
386
+ replacements = {
387
+ "amyotrophic lateral sclerosis": "ALS or Lou Gehrig's disease",
388
+ "motor neurons": "nerve cells that control muscles",
389
+ "neurodegeneration": "nerve cell damage",
390
+ "pathogenesis": "disease development",
391
+ "etiology": "cause",
392
+ "prognosis": "expected outcome",
393
+ "therapeutic": "treatment",
394
+ "pharmacological": "drug-based",
395
+ "intervention": "treatment",
396
+ "mortality": "death rate",
397
+ "morbidity": "illness rate"
398
+ }
399
+
400
+ simplified = content.lower()
401
+ for term, replacement in replacements.items():
402
+ simplified = simplified.replace(term, replacement)
403
+
404
+ # Truncate to target length
405
+ words = simplified.split()
406
+ if len(words) > target_words:
407
+ words = words[:target_words]
408
+ simplified = " ".join(words) + "..."
409
+
410
+ return simplified.capitalize()
411
+
412
+
413
+ def _extract_clinical_relevance(content: str, target_words: int) -> str:
414
+ """Extract clinically relevant information."""
415
+ # Focus on treatment, outcomes, and practical implications
416
+
417
+ # First, strip references for cleaner audio
418
+ content = _strip_references(content)
419
+
420
+ # Look for key clinical phrases
421
+ clinical_markers = [
422
+ "treatment", "therapy", "outcome", "survival", "progression",
423
+ "clinical trial", "efficacy", "safety", "adverse", "benefit",
424
+ "patient", "dose", "administration"
425
+ ]
426
+
427
+ sentences = content.split(". ")
428
+ relevant_sentences = []
429
+
430
+ for sentence in sentences:
431
+ if any(marker in sentence.lower() for marker in clinical_markers):
432
+ relevant_sentences.append(sentence)
433
+
434
+ result = ". ".join(relevant_sentences)
435
+
436
+ # Truncate to target length
437
+ words = result.split()
438
+ if len(words) > target_words:
439
+ words = words[:target_words]
440
+ result = " ".join(words) + "..."
441
+
442
+ return result
443
+
444
+
445
+ def _create_research_summary(content: str, target_words: int) -> str:
446
+ """Create a research-focused summary."""
447
+ # Extract key findings and implications
448
+
449
+ # First, strip references section if present
450
+ content = _strip_references(content)
451
+
452
+ # Simply truncate for now (could be enhanced with NLP)
453
+ words = content.split()
454
+ if len(words) > target_words:
455
+ words = words[:target_words]
456
+ content = " ".join(words) + "..."
457
+
458
+ return content
459
+
460
+
461
+ def _strip_references(content: str) -> str:
462
+ """Remove references section and citations from content for audio reading."""
463
+ import re
464
+
465
+ # Extract only synthesis content if it's marked
466
+ synthesis_match = re.search(r'✅\s*SYNTHESIS:?\s*(.*?)(?=##?\s*References|##?\s*Bibliography|$)',
467
+ content, flags=re.DOTALL | re.IGNORECASE)
468
+ if synthesis_match:
469
+ content = synthesis_match.group(1)
470
+
471
+ # Remove References section (multiple possible formats)
472
+ patterns_to_remove = [
473
+ r'##?\s*References.*$', # ## References or # References to end
474
+ r'##?\s*Bibliography.*$', # Bibliography section
475
+ r'##?\s*Citations.*$', # Citations section
476
+ r'##?\s*Works Cited.*$', # Works Cited section
477
+ r'##?\s*Key References.*$', # Key References section
478
+ ]
479
+
480
+ for pattern in patterns_to_remove:
481
+ content = re.sub(pattern, '', content, flags=re.DOTALL | re.IGNORECASE)
482
+
483
+ # Remove inline citations like [1], [2,3], [PMID: 12345678]
484
+ content = re.sub(r'\[[\d,\s]+\]', '', content) # [1], [2,3], etc.
485
+ content = re.sub(r'\[PMID:\s*\d+\]', '', content) # [PMID: 12345678]
486
+ content = re.sub(r'\[NCT\d+\]', '', content) # [NCT12345678]
487
+
488
+ # Remove URLs for cleaner audio
489
+ content = re.sub(r'https?://[^\s\)]+', '', content)
490
+ content = re.sub(r'www\.[^\s\)]+', '', content)
491
+
492
+ # Remove PMID/DOI/NCT references
493
+ content = re.sub(r'PMID:\s*\d+', '', content)
494
+ content = re.sub(r'DOI:\s*[^\s]+', '', content)
495
+ content = re.sub(r'NCT\d{8}', '', content)
496
+
497
+ # Remove markdown formatting that sounds awkward in audio
498
+ content = re.sub(r'\*\*(.*?)\*\*', r'\1', content) # Remove bold
499
+ content = re.sub(r'\*(.*?)\*', r'\1', content) # Remove italic
500
+ content = re.sub(r'`(.*?)`', r'\1', content) # Remove inline code
501
+ content = re.sub(r'#{1,6}\s*', '', content) # Remove headers
502
+ content = re.sub(r'^[-*+]\s+', '', content, flags=re.MULTILINE) # Remove bullet points
503
+ content = re.sub(r'^\d+\.\s+', '', content, flags=re.MULTILINE) # Remove numbered lists
504
+
505
+ # Replace markdown links with just the text
506
+ content = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', content)
507
+
508
+ # Clean up extra whitespace
509
+ content = re.sub(r'\s+', ' ', content)
510
+ content = re.sub(r'\n{3,}', '\n\n', content)
511
+
512
+ return content.strip()
513
+
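A rough before/after of the stripping (exact spacing depends on the regex cleanup above):

```python
raw = "Riluzole modestly extends survival [1,2] (PMID: 9476265). See https://doi.org/10.1000/x for details."
print(_strip_references(raw))
# roughly: "Riluzole modestly extends survival (). See for details."
```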
514
+
515
+ def _get_phonetic_spelling(term: str) -> str:
516
+ """Generate phonetic spelling for medical terms."""
517
+ # Basic phonetic rules for medical terms
518
+ # This could be enhanced with a medical pronunciation dictionary
519
+
520
+ phonetic_map = {
521
+ "amyotrophic": "AM-ee-oh-TROH-fik",
522
+ "lateral": "LAT-er-al",
523
+ "sclerosis": "skleh-ROH-sis",
524
+ "tdp-43": "T-D-P forty-three",
525
+ "riluzole": "RIL-you-zole",
526
+ "edaravone": "ed-AR-a-vone",
527
+ "tofersen": "TOE-fer-sen",
528
+ "neurofilament": "NUR-oh-FIL-a-ment",
529
+ "astrocyte": "AS-tro-site",
530
+ "oligodendrocyte": "oh-li-go-DEN-dro-site"
531
+ }
532
+
533
+ term_lower = term.lower()
534
+ if term_lower in phonetic_map:
535
+ return phonetic_map[term_lower]
536
+
537
+ # Basic syllable breakdown for unknown terms
538
+ # This is very simplified and could be improved
539
+ syllables = []
540
+ current = ""
541
+ for char in term:
542
+ if char in "aeiouAEIOU" and current:
543
+ syllables.append(current + char)
544
+ current = ""
545
+ else:
546
+ current += char
547
+ if current:
548
+ syllables.append(current)
549
+
550
+ return "-".join(syllables).upper()
551
+
552
+
553
+ if __name__ == "__main__":
554
+ # Check for API key
555
+ if not ELEVENLABS_API_KEY:
556
+ logger.warning("ELEVENLABS_API_KEY not set in environment")
557
+ logger.warning("Voice features will be limited without API key")
558
+ logger.info("Get your API key at: https://elevenlabs.io")
559
+
560
+ # Run the MCP server
561
+ mcp.run(transport="stdio")
servers/fetch_server.py ADDED
@@ -0,0 +1,206 @@
1
+ # fetch_server.py
2
+ from mcp.server.fastmcp import FastMCP
3
+ import httpx
4
+ from bs4 import BeautifulSoup
5
+ from urllib.parse import urlparse
6
+ import logging
7
+ import sys
8
+ from pathlib import Path
9
+
10
+ # Add parent directory to path for shared imports
11
+ sys.path.insert(0, str(Path(__file__).parent.parent))
12
+
13
+ from shared import (
14
+ config,
15
+ clean_whitespace,
16
+ truncate_text
17
+ )
18
+ from shared.http_client import get_http_client
19
+
20
+ # Configure logging
21
+ logging.basicConfig(level=logging.INFO)
22
+ logger = logging.getLogger(__name__)
23
+
24
+ mcp = FastMCP("fetch-server")
25
+
26
+ def validate_url(url: str) -> tuple[bool, str]:
27
+ """Validate URL for security concerns. Returns (is_valid, error_message)"""
28
+ try:
29
+ parsed = urlparse(url)
30
+
31
+ # Check scheme using shared config
32
+ if parsed.scheme not in config.security.allowed_schemes:
33
+ return False, f"Invalid URL scheme. Only {', '.join(config.security.allowed_schemes)} are allowed."
34
+
35
+ # Check for blocked hosts (SSRF protection)
36
+ hostname = parsed.hostname
37
+ if not hostname:
38
+ return False, "Invalid URL: no hostname found."
39
+
40
+ # Use shared security config for SSRF checks
41
+ if config.security.is_private_ip(hostname):
42
+ return False, "Access to localhost/private IPs is not allowed."
43
+
44
+ return True, ""
45
+
46
+ except Exception as e:
47
+ return False, f"Invalid URL: {str(e)}"
48
+
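Expected behavior of the validator, assuming `config.security` permits http/https and blocks private ranges (the exact error strings come from the code above and the shared config):

```python
print(validate_url("https://pubmed.ncbi.nlm.nih.gov/12345/"))
# (True, "")
print(validate_url("http://127.0.0.1:8080/admin"))
# (False, "Access to localhost/private IPs is not allowed.")
print(validate_url("ftp://example.org/file"))
# (False, "Invalid URL scheme. Only http, https are allowed.")  # message depends on allowed_schemes
```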
49
+ def parse_clinical_trial_page(soup: BeautifulSoup, url: str) -> str | None:
50
+ """Parse ClinicalTrials.gov trial detail page for structured data."""
51
+ # Check if this is a ClinicalTrials.gov page
52
+ if "clinicaltrials.gov" not in url.lower():
53
+ return None
54
+
55
+ # Extract NCT ID from URL
56
+ import re
57
+ nct_match = re.search(r'NCT\d{8}', url)
58
+ nct_id = nct_match.group() if nct_match else "Unknown"
59
+
60
+ # Try to extract key trial information
61
+ trial_info = []
62
+ trial_info.append(f"**NCT ID:** {nct_id}")
63
+ trial_info.append(f"**URL:** {url}")
64
+
65
+ # Look for title
66
+ title = soup.find('h1')
67
+ if title:
68
+ trial_info.append(f"**Title:** {title.get_text(strip=True)}")
69
+
70
+ # Look for status (various patterns)
71
+ status_patterns = [
72
+ soup.find('span', string=re.compile(r'Recruiting|Active|Completed|Enrolling', re.I)),
73
+ soup.find('div', string=re.compile(r'Recruitment Status', re.I))
74
+ ]
75
+ for pattern in status_patterns:
76
+ if pattern:
77
+ status_text = pattern.get_text(strip=True) if hasattr(pattern, 'get_text') else str(pattern)
78
+ trial_info.append(f"**Status:** {status_text}")
79
+ break
80
+
81
+ # Look for study description
82
+ desc_section = soup.find('div', {'class': re.compile('description', re.I)})
83
+ if desc_section:
84
+ desc_text = desc_section.get_text(strip=True)[:500]
85
+ trial_info.append(f"**Description:** {desc_text}...")
86
+
87
+ # Look for conditions
88
+ conditions = soup.find_all(string=re.compile(r'Condition', re.I))
89
+ if conditions:
90
+ for cond in conditions[:1]: # Just first mention
91
+ parent = cond.parent
92
+ if parent:
93
+ trial_info.append(f"**Condition:** {parent.get_text(strip=True)[:200]}")
94
+ break
95
+
96
+ # Look for interventions
97
+ interventions = soup.find_all(string=re.compile(r'Intervention', re.I))
98
+ if interventions:
99
+ for inter in interventions[:1]: # Just first mention
100
+ parent = inter.parent
101
+ if parent:
102
+ trial_info.append(f"**Intervention:** {parent.get_text(strip=True)[:200]}")
103
+ break
104
+
105
+ # Look for sponsor
106
+ sponsor = soup.find(string=re.compile(r'Sponsor', re.I))
107
+ if sponsor and sponsor.parent:
108
+ trial_info.append(f"**Sponsor:** {sponsor.parent.get_text(strip=True)[:100]}")
109
+
110
+ # Locations/Sites
111
+ locations = soup.find_all(string=re.compile(r'Location|Site', re.I))
112
+ if locations:
113
+ location_texts = []
114
+ for loc in locations[:3]: # First 3 locations
115
+ if loc.parent:
116
+ location_texts.append(loc.parent.get_text(strip=True)[:50])
117
+ if location_texts:
118
+ trial_info.append(f"**Locations:** {', '.join(location_texts)}")
119
+
120
+ if len(trial_info) > 2: # If we found meaningful data
121
+ return "\n\n".join(trial_info) + "\n\n**Note:** This is extracted from the trial webpage. Some details may be incomplete due to page structure variations."
122
+
123
+ return None
124
+
125
+ @mcp.tool()
126
+ async def fetch_url(url: str, extract_text_only: bool = True) -> str:
127
+ """Fetch content from a URL (paper abstract page, news article, etc.).
128
+
129
+ Args:
130
+ url: URL to fetch
131
+ extract_text_only: Extract only main text content (default: True)
132
+ """
133
+ try:
134
+ logger.info(f"Fetching URL: {url}")
135
+
136
+ # Validate URL
137
+ is_valid, error_msg = validate_url(url)
138
+ if not is_valid:
139
+ logger.warning(f"URL validation failed: {error_msg}")
140
+ return f"Error: {error_msg}"
141
+
142
+ # Use shared HTTP client for connection pooling
143
+ client = get_http_client(timeout=config.api.timeout)
144
+ response = await client.get(url, headers={
145
+ "User-Agent": config.api.user_agent
146
+ })
147
+ response.raise_for_status()
148
+
149
+ # Check content size using shared config
150
+ content_length = response.headers.get('content-length')
151
+ if content_length and int(content_length) > config.content_limits.max_content_size:
152
+ logger.warning(f"Content too large: {content_length} bytes")
153
+ return f"Error: Content size ({content_length} bytes) exceeds maximum allowed size of {config.content_limits.max_content_size} bytes"
154
+
155
+ # Check actual content size
156
+ if len(response.content) > config.content_limits.max_content_size:
157
+ logger.warning(f"Content too large: {len(response.content)} bytes")
158
+ return f"Error: Content size exceeds maximum allowed size of {config.content_limits.max_content_size} bytes"
159
+
160
+ if extract_text_only:
161
+ soup = BeautifulSoup(response.text, 'html.parser')
162
+
163
+ # Check if this is a clinical trial page and try enhanced parsing
164
+ trial_data = parse_clinical_trial_page(soup, url)
165
+ if trial_data:
166
+ logger.info(f"Successfully parsed clinical trial page: {url}")
167
+ return trial_data
168
+
169
+ # Otherwise, do standard text extraction
170
+ # Remove script and style elements
171
+ for script in soup(["script", "style", "meta", "link"]):
172
+ script.decompose()
173
+
174
+ # Get text
175
+ text = soup.get_text()
176
+
177
+ # Clean up whitespace using shared utility
178
+ text = clean_whitespace(text)
179
+
180
+ # Limit to reasonable size for LLM context using shared utility
181
+ text = truncate_text(text, max_chars=config.content_limits.max_text_chars)
182
+
183
+ logger.info(f"Successfully fetched and extracted text from {url}")
184
+ return text
185
+ else:
186
+ # Return raw HTML, but still limit size using shared utility
187
+ html = truncate_text(response.text, max_chars=config.content_limits.max_text_chars)
188
+
189
+ logger.info(f"Successfully fetched raw HTML from {url}")
190
+ return html
191
+
192
+ except httpx.TimeoutException:
193
+ logger.error(f"Request to {url} timed out")
194
+ return f"Error: Request timed out after {config.api.timeout} seconds"
195
+ except httpx.HTTPStatusError as e:
196
+ logger.error(f"HTTP error fetching {url}: {e}")
197
+ return f"Error: HTTP {e.response.status_code} - {e.response.reason_phrase}"
198
+ except httpx.RequestError as e:
199
+ logger.error(f"Request error fetching {url}: {e}")
200
+ return f"Error: Failed to fetch URL - {str(e)}"
201
+ except Exception as e:
202
+ logger.error(f"Unexpected error fetching {url}: {e}")
203
+ return f"Error: {str(e)}"
204
+
205
+ if __name__ == "__main__":
206
+ mcp.run(transport="stdio")
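
For reference, a minimal sketch of how the SSRF guard above behaves (assuming this module is importable as servers.fetch_server and the default SecurityConfig is in effect):

from servers.fetch_server import validate_url

for url in [
    "https://pubmed.ncbi.nlm.nih.gov/12345/",  # allowed: https scheme, public host
    "http://127.0.0.1:8080/admin",             # blocked: localhost is in blocked_hosts
    "ftp://example.org/data.csv",              # blocked: scheme not in allowed_schemes
]:
    ok, err = validate_url(url)
    print(url, "->", "allowed" if ok else f"blocked ({err})")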
servers/llamaindex_server.py ADDED
@@ -0,0 +1,729 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ LlamaIndex MCP Server for Research Memory and RAG
4
+ Provides persistent memory and semantic search capabilities for ALS Research Agent
5
+
6
+ This server enables the agent to remember all research it encounters, build
7
+ knowledge over time, and discover connections between papers.
8
+ """
9
+
10
+ from mcp.server.fastmcp import FastMCP
11
+ import logging
12
+ import os
13
+ import json
14
+ import hashlib
15
+ from typing import Optional, List, Dict, Any
16
+ from pathlib import Path
17
+ import sys
18
+ from datetime import datetime
19
+ import asyncio
20
+
21
+ # Add parent directory to path for shared imports
22
+ sys.path.insert(0, str(Path(__file__).parent.parent))
23
+
24
+ from shared import config
25
+
26
+ # Configure logging
27
+ logging.basicConfig(level=logging.INFO)
28
+ logger = logging.getLogger(__name__)
29
+
30
+ # Initialize MCP server
31
+ mcp = FastMCP("llamaindex-rag")
32
+
33
+ # Import LlamaIndex components (will be installed)
34
+ try:
35
+ from llama_index.core import (
36
+ VectorStoreIndex,
37
+ Document,
38
+ StorageContext,
39
+ Settings,
40
+ load_index_from_storage
41
+ )
42
+ from llama_index.core.node_parser import SentenceSplitter
43
+ from llama_index.vector_stores.chroma import ChromaVectorStore
44
+ from llama_index.embeddings.huggingface import HuggingFaceEmbedding
45
+ import chromadb
46
+ LLAMAINDEX_AVAILABLE = True
47
+ except ImportError:
48
+ LLAMAINDEX_AVAILABLE = False
49
+ logger.warning("LlamaIndex not installed. Install with: pip install llama-index chromadb sentence-transformers")
50
+
51
+ # Configuration
52
+ CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_db")
53
+ EMBED_MODEL = os.getenv("LLAMAINDEX_EMBED_MODEL", "dmis-lab/biobert-base-cased-v1.2")
54
+ CHUNK_SIZE = int(os.getenv("LLAMAINDEX_CHUNK_SIZE", "1024"))
55
+ CHUNK_OVERLAP = int(os.getenv("LLAMAINDEX_CHUNK_OVERLAP", "200"))
56
+
57
+ # Global index storage
58
+ research_index = None
59
+ chroma_client = None
60
+ collection = None
61
+ papers_metadata = {} # Store paper metadata separately
62
+
63
+
64
+ class ResearchMemoryManager:
65
+ """Manages persistent research memory using LlamaIndex and ChromaDB"""
66
+
67
+ def __init__(self):
68
+ self.index = None
69
+ self.chroma_client = None
70
+ self.collection = None
71
+ self.metadata_path = Path(CHROMA_DB_PATH) / "metadata.json"
72
+
73
+ if LLAMAINDEX_AVAILABLE:
74
+ self._initialize_index()
75
+
76
+ def _initialize_index(self):
77
+ """Initialize or load existing index from ChromaDB"""
78
+ try:
79
+ # Create directory if it doesn't exist
80
+ Path(CHROMA_DB_PATH).mkdir(parents=True, exist_ok=True)
81
+
82
+ # Initialize ChromaDB client
83
+ self.chroma_client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
84
+
85
+ # Get or create collection
86
+ try:
87
+ self.collection = self.chroma_client.get_collection("als_research")
88
+ logger.info(f"Loaded existing ChromaDB collection with {self.collection.count()} papers")
89
+ except Exception:
90
+ self.collection = self.chroma_client.create_collection("als_research")
91
+ logger.info("Created new ChromaDB collection")
92
+
93
+ # Initialize embedding model - prefer biomedical models
94
+ try:
95
+ embed_model = HuggingFaceEmbedding(
96
+ model_name=EMBED_MODEL,
97
+ cache_folder="./embed_cache"
98
+ )
99
+ logger.info(f"Using embedding model: {EMBED_MODEL}")
100
+ except Exception as e:
101
+ logger.warning(f"Failed to load {EMBED_MODEL} ({e}); falling back to default embedding model")
102
+ embed_model = HuggingFaceEmbedding(
103
+ model_name="sentence-transformers/all-MiniLM-L6-v2",
104
+ cache_folder="./embed_cache"
105
+ )
106
+
107
+ # Configure settings
108
+ Settings.embed_model = embed_model
109
+ Settings.chunk_size = CHUNK_SIZE
110
+ Settings.chunk_overlap = CHUNK_OVERLAP
111
+
112
+ # Initialize vector store
113
+ vector_store = ChromaVectorStore(chroma_collection=self.collection)
114
+ storage_context = StorageContext.from_defaults(vector_store=vector_store)
115
+
116
+ # Create or load index
117
+ if self.collection.count() > 0:
118
+ # Load existing index
119
+ self.index = VectorStoreIndex.from_vector_store(
120
+ vector_store,
121
+ storage_context=storage_context
122
+ )
123
+ logger.info("Loaded existing vector index")
124
+ else:
125
+ # Create new index
126
+ self.index = VectorStoreIndex(
127
+ [],
128
+ storage_context=storage_context
129
+ )
130
+ logger.info("Created new vector index")
131
+
132
+ # Load metadata
133
+ self._load_metadata()
134
+
135
+ except Exception as e:
136
+ logger.error(f"Failed to initialize index: {e}")
137
+ self.index = None
138
+
139
+ def _load_metadata(self):
140
+ """Load paper metadata from disk"""
141
+ global papers_metadata
142
+ if self.metadata_path.exists():
143
+ try:
144
+ with open(self.metadata_path, 'r') as f:
145
+ papers_metadata = json.load(f)
146
+ logger.info(f"Loaded metadata for {len(papers_metadata)} papers")
147
+ except Exception as e:
148
+ logger.error(f"Failed to load metadata: {e}")
149
+ papers_metadata = {}
150
+ else:
151
+ papers_metadata = {}
152
+
153
+ def _save_metadata(self):
154
+ """Save paper metadata to disk"""
155
+ try:
156
+ with open(self.metadata_path, 'w') as f:
157
+ json.dump(papers_metadata, f, indent=2, default=str)
158
+ except Exception as e:
159
+ logger.error(f"Failed to save metadata: {e}")
160
+
161
+ def generate_paper_id(self, title: str, doi: Optional[str] = None) -> str:
162
+ """Generate unique ID for a paper"""
163
+ if doi:
164
+ return hashlib.md5(doi.encode()).hexdigest()
165
+ return hashlib.md5(title.lower().encode()).hexdigest()
166
+
167
+ async def index_paper(
168
+ self,
169
+ title: str,
170
+ abstract: str,
171
+ authors: List[str],
172
+ doi: Optional[str] = None,
173
+ journal: Optional[str] = None,
174
+ year: Optional[int] = None,
175
+ findings: Optional[str] = None,
176
+ url: Optional[str] = None,
177
+ paper_type: str = "research"
178
+ ) -> Dict[str, Any]:
179
+ """Index a research paper with metadata"""
180
+
181
+ if not self.index:
182
+ return {"status": "error", "message": "Index not initialized"}
183
+
184
+ # Generate unique ID
185
+ paper_id = self.generate_paper_id(title, doi)
186
+
187
+ # Check if already indexed
188
+ if paper_id in papers_metadata:
189
+ return {
190
+ "status": "already_indexed",
191
+ "paper_id": paper_id,
192
+ "title": title,
193
+ "message": "Paper already in research memory"
194
+ }
195
+
196
+ # Prepare document text
197
+ doc_text = f"Title: {title}\n\n"
198
+ doc_text += f"Authors: {', '.join(authors)}\n\n"
199
+
200
+ if journal:
201
+ doc_text += f"Journal: {journal}\n"
202
+ if year:
203
+ doc_text += f"Year: {year}\n\n"
204
+
205
+ doc_text += f"Abstract: {abstract}\n\n"
206
+
207
+ if findings:
208
+ doc_text += f"Key Findings: {findings}\n\n"
209
+
210
+ # Create document with metadata (ChromaDB only accepts strings, not lists)
211
+ metadata = {
212
+ "paper_id": paper_id,
213
+ "title": title,
214
+ "authors": ", ".join(authors) if authors else "", # Convert list to string
215
+ "doi": doi,
216
+ "journal": journal,
217
+ "year": year,
218
+ "url": url,
219
+ "paper_type": paper_type,
220
+ "indexed_at": datetime.now().isoformat()
221
+ }
222
+
223
+ document = Document(
224
+ text=doc_text,
225
+ metadata=metadata
226
+ )
227
+
228
+ try:
229
+ # Add to index
230
+ self.index.insert(document)
231
+
232
+ # Store metadata
233
+ papers_metadata[paper_id] = metadata
234
+ self._save_metadata()
235
+
236
+ logger.info(f"Indexed paper: {title}")
237
+
238
+ return {
239
+ "status": "success",
240
+ "paper_id": paper_id,
241
+ "title": title,
242
+ "message": "Successfully indexed paper into research memory"
243
+ }
244
+
245
+ except Exception as e:
246
+ logger.error(f"Failed to index paper: {e}")
247
+ return {
248
+ "status": "error",
249
+ "message": f"Failed to index paper: {str(e)}"
250
+ }
251
+
252
+ async def search_similar(
253
+ self,
254
+ query: str,
255
+ top_k: int = 5,
256
+ include_scores: bool = True
257
+ ) -> List[Dict[str, Any]]:
258
+ """Search for similar research in memory"""
259
+
260
+ if not self.index:
261
+ return []
262
+
263
+ try:
264
+ # Use retriever for direct vector search (no LLM needed)
265
+ retriever = self.index.as_retriever(
266
+ similarity_top_k=top_k
267
+ )
268
+
269
+ # Search using retriever
270
+ nodes = retriever.retrieve(query)
271
+
272
+ results = []
273
+ for node in nodes:
274
+ result = {
275
+ "text": node.text[:500] + "..." if len(node.text) > 500 else node.text,
276
+ "metadata": node.metadata,
277
+ "score": node.score if include_scores else None
278
+ }
279
+ results.append(result)
280
+
281
+ return results
282
+
283
+ except Exception as e:
284
+ logger.error(f"Search failed: {e}")
285
+ return []
286
+
287
+
288
+ # Global manager - will be initialized on first use
289
+ memory_manager = None
290
+ _initialization_lock = asyncio.Lock() # Prevent race conditions during initialization
291
+ _initialization_started = False
292
+
293
+
294
+ async def ensure_initialized():
295
+ """Ensure the memory manager is initialized (lazy initialization)."""
296
+ global memory_manager, _initialization_started
297
+
298
+ # Quick check without lock
299
+ if memory_manager is not None:
300
+ return True
301
+
302
+ # Thread-safe initialization
303
+ async with _initialization_lock:
304
+ # Double-check after acquiring lock
305
+ if memory_manager is not None:
306
+ return True
307
+
308
+ if not LLAMAINDEX_AVAILABLE:
309
+ return False
310
+
311
+ if _initialization_started:
312
+ # Another thread is initializing, wait for it
313
+ while memory_manager is None and _initialization_started:
314
+ await asyncio.sleep(0.1)
315
+ return memory_manager is not None
316
+
317
+ try:
318
+ _initialization_started = True
319
+ logger.info("🔄 Initializing LlamaIndex RAG system (this may take 20-30 seconds)...")
320
+ logger.info(" Loading BioBERT embedding model...")
321
+
322
+ # Initialize the memory manager
323
+ memory_manager = ResearchMemoryManager()
324
+
325
+ logger.info("✅ LlamaIndex RAG system initialized successfully")
326
+ return True
327
+
328
+ except Exception as e:
329
+ logger.error(f"❌ Failed to initialize LlamaIndex: {e}")
330
+ _initialization_started = False
331
+ return False
332
+
333
+
334
+ @mcp.tool()
335
+ async def index_paper(
336
+ title: str,
337
+ abstract: str,
338
+ authors: str,
339
+ doi: Optional[str] = None,
340
+ journal: Optional[str] = None,
341
+ year: Optional[int] = None,
342
+ findings: Optional[str] = None,
343
+ url: Optional[str] = None
344
+ ) -> str:
345
+ """Index a research paper into persistent memory for future retrieval.
346
+
347
+ The agent's research memory persists across sessions, building knowledge over time.
348
+
349
+ Args:
350
+ title: Paper title
351
+ abstract: Paper abstract or summary
352
+ authors: Comma-separated list of authors
353
+ doi: Digital Object Identifier (optional)
354
+ journal: Journal or preprint server name (optional)
355
+ year: Publication year (optional)
356
+ findings: Key findings or implications (optional)
357
+ url: URL to paper (optional)
358
+
359
+ Returns:
360
+ Status of indexing operation
361
+ """
362
+
363
+ if not LLAMAINDEX_AVAILABLE:
364
+ return json.dumps({
365
+ "status": "error",
366
+ "error": "LlamaIndex not installed",
367
+ "message": "Install with: pip install llama-index chromadb sentence-transformers"
368
+ }, indent=2)
369
+
370
+ # Lazy initialization on first use
371
+ if not await ensure_initialized():
372
+ return json.dumps({
373
+ "status": "error",
374
+ "error": "Memory manager initialization failed",
375
+ "message": "Check LlamaIndex configuration and dependencies"
376
+ }, indent=2)
377
+
378
+ try:
379
+ # Parse authors
380
+ authors_list = [a.strip() for a in authors.split(",")]
381
+
382
+ result = await memory_manager.index_paper(
383
+ title=title,
384
+ abstract=abstract,
385
+ authors=authors_list,
386
+ doi=doi,
387
+ journal=journal,
388
+ year=year,
389
+ findings=findings,
390
+ url=url
391
+ )
392
+
393
+ if result["status"] == "success":
394
+ return json.dumps({
395
+ "status": "success",
396
+ "paper_id": result["paper_id"],
397
+ "title": result["title"],
398
+ "message": f"✅ Indexed into research memory. Total papers: {len(papers_metadata)}",
399
+ "total_papers_indexed": len(papers_metadata)
400
+ }, indent=2)
401
+
402
+ elif result["status"] == "already_indexed":
403
+ return json.dumps({
404
+ "status": "already_indexed",
405
+ "paper_id": result["paper_id"],
406
+ "title": result["title"],
407
+ "message": "ℹ️ Paper already in research memory",
408
+ "total_papers_indexed": len(papers_metadata)
409
+ }, indent=2)
410
+
411
+ else:
412
+ return json.dumps({"status": "error", "error": "Indexing failed", "message": result.get("message", "Unknown error")}, indent=2)
413
+
414
+ except Exception as e:
415
+ logger.error(f"Error indexing paper: {e}")
416
+ return json.dumps({"status": "error", "error": "Indexing error", "message": str(e)}, indent=2)
417
+
418
+
419
+ @mcp.tool()
420
+ async def semantic_search(
421
+ query: str,
422
+ max_results: int = 5
423
+ ) -> str:
424
+ """Search research memory using semantic similarity.
425
+
426
+ Finds papers similar to your query across all indexed research,
427
+ even if they don't contain exact keywords.
428
+
429
+ Args:
430
+ query: Search query (can be a question, topic, or paper abstract)
431
+ max_results: Maximum number of results to return (default: 5)
432
+
433
+ Returns:
434
+ Similar papers from research memory
435
+ """
436
+
437
+ if not LLAMAINDEX_AVAILABLE:
438
+ return json.dumps({
439
+ "status": "error",
440
+ "error": "LlamaIndex not installed",
441
+ "message": "Install with: pip install llama-index chromadb sentence-transformers"
442
+ }, indent=2)
443
+
444
+ # Lazy initialization on first use
445
+ if not await ensure_initialized():
446
+ return json.dumps({
447
+ "status": "error",
448
+ "error": "Memory manager initialization failed",
449
+ "message": "Check LlamaIndex configuration and dependencies"
450
+ }, indent=2)
451
+
452
+ if not memory_manager.index:
453
+ return json.dumps({
454
+ "status": "error",
455
+ "error": "No research memory available",
456
+ "message": "No papers have been indexed yet"
457
+ }, indent=2)
458
+
459
+ try:
460
+ results = await memory_manager.search_similar(
461
+ query=query,
462
+ top_k=max_results
463
+ )
464
+
465
+ if not results:
466
+ return json.dumps({
467
+ "status": "no_results",
468
+ "query": query,
469
+ "message": "No similar research found in memory"
470
+ }, indent=2)
471
+
472
+ # Format results
473
+ formatted_results = []
474
+ for i, result in enumerate(results, 1):
475
+ metadata = result["metadata"]
476
+ formatted_results.append({
477
+ "rank": i,
478
+ "title": metadata.get("title", "Unknown"),
479
+ "authors": metadata.get("authors", ""),  # stored as a comma-separated string
480
+ "year": metadata.get("year"),
481
+ "journal": metadata.get("journal"),
482
+ "doi": metadata.get("doi"),
483
+ "url": metadata.get("url"),
484
+ "similarity_score": round(result["score"], 3) if result["score"] else None,
485
+ "excerpt": result["text"][:300] + "..."
486
+ })
487
+
488
+ return json.dumps({
489
+ "status": "success",
490
+ "query": query,
491
+ "num_results": len(formatted_results),
492
+ "results": formatted_results,
493
+ "message": f"Found {len(formatted_results)} similar papers in research memory"
494
+ }, indent=2)
495
+
496
+ except Exception as e:
497
+ logger.error(f"Search error: {e}")
498
+ return json.dumps({"status": "error", "error": "Search failed", "message": str(e)}, indent=2)
499
+
500
+
501
+ @mcp.tool()
502
+ async def get_research_connections(
503
+ paper_title: str,
504
+ connection_type: str = "similar",
505
+ max_connections: int = 5
506
+ ) -> str:
507
+ """Discover connections between research papers in memory.
508
+
509
+ Finds related papers that might share themes, methods, or findings.
510
+
511
+ Args:
512
+ paper_title: Title of paper to find connections for
513
+ connection_type: Type of connections - "similar", "citations", "authors"
514
+ max_connections: Maximum connections to return
515
+
516
+ Returns:
517
+ Connected papers with relationship descriptions
518
+ """
519
+
520
+ if not LLAMAINDEX_AVAILABLE:
521
+ return json.dumps({
522
+ "status": "error",
523
+ "error": "LlamaIndex not installed",
524
+ "message": "Install with: pip install llama-index chromadb sentence-transformers"
525
+ }, indent=2)
526
+
527
+ # Lazy initialization on first use
528
+ if not await ensure_initialized():
529
+ return json.dumps({
530
+ "status": "error",
531
+ "error": "Memory manager initialization failed",
532
+ "message": "Check LlamaIndex configuration and dependencies"
533
+ }, indent=2)
534
+
535
+ try:
536
+ # For now, we'll use similarity search
537
+ # Future: implement citation networks, co-authorship graphs
538
+
539
+ if connection_type == "similar":
540
+ # Search for papers similar to this title
541
+ results = await memory_manager.search_similar(
542
+ query=paper_title,
543
+ top_k=max_connections + 1 # +1 because it might include itself
544
+ )
545
+
546
+ # Filter out the paper itself
547
+ filtered_results = []
548
+ for result in results:
549
+ if result["metadata"].get("title", "").lower() != paper_title.lower():
550
+ filtered_results.append(result)
551
+
552
+ if not filtered_results:
553
+ return json.dumps({
554
+ "status": "no_connections",
555
+ "paper": paper_title,
556
+ "message": "No connections found in research memory"
557
+ }, indent=2)
558
+
559
+ connections = []
560
+ for result in filtered_results[:max_connections]:
561
+ metadata = result["metadata"]
562
+ connections.append({
563
+ "title": metadata.get("title", "Unknown"),
564
+ "authors": metadata.get("authors", ""),
565
+ "year": metadata.get("year"),
566
+ "connection_strength": round(result["score"], 3) if result["score"] else None,
567
+ "connection_type": "semantic_similarity",
568
+ "url": metadata.get("url")
569
+ })
570
+
571
+ return json.dumps({
572
+ "status": "success",
573
+ "paper": paper_title,
574
+ "connection_type": connection_type,
575
+ "num_connections": len(connections),
576
+ "connections": connections,
577
+ "message": f"Found {len(connections)} connected papers"
578
+ }, indent=2)
579
+
580
+ else:
581
+ return json.dumps({
582
+ "status": "not_implemented",
583
+ "message": f"Connection type '{connection_type}' not yet implemented. Use 'similar' for now."
584
+ }, indent=2)
585
+
586
+ except Exception as e:
587
+ logger.error(f"Error finding connections: {e}")
588
+ return json.dumps({"status": "error", "error": "Connection search failed", "message": str(e)}, indent=2)
589
+
590
+
591
+ @mcp.tool()
592
+ async def list_indexed_papers(
593
+ limit: int = 20,
594
+ sort_by: str = "date"
595
+ ) -> str:
596
+ """List papers currently in research memory.
597
+
598
+ Shows what research the agent has learned from previously.
599
+
600
+ Args:
601
+ limit: Maximum papers to list (default: 20)
602
+ sort_by: Sort order - "date" (indexed date) or "year" (publication year)
603
+
604
+ Returns:
605
+ List of indexed papers with metadata
606
+ """
607
+
608
+ if not papers_metadata:
609
+ return json.dumps({
610
+ "status": "empty",
611
+ "message": "No papers indexed yet. Research memory is empty.",
612
+ "total_papers": 0
613
+ }, indent=2)
614
+
615
+ try:
616
+ # Get papers list
617
+ papers_list = []
618
+ for paper_id, metadata in papers_metadata.items():
619
+ # Convert authors string back to list
620
+ authors_str = metadata.get("authors", "")
621
+ authors_list = authors_str.split(", ") if authors_str else []
622
+
623
+ papers_list.append({
624
+ "paper_id": paper_id,
625
+ "title": metadata.get("title", "Unknown"),
626
+ "authors": authors_list,
627
+ "year": metadata.get("year"),
628
+ "journal": metadata.get("journal"),
629
+ "doi": metadata.get("doi"),
630
+ "indexed_at": metadata.get("indexed_at"),
631
+ "url": metadata.get("url")
632
+ })
633
+
634
+ # Sort
635
+ if sort_by == "date":
636
+ papers_list.sort(key=lambda x: x.get("indexed_at", ""), reverse=True)
637
+ elif sort_by == "year":
638
+ papers_list.sort(key=lambda x: x.get("year") or 0, reverse=True)
639
+
640
+ # Limit
641
+ papers_list = papers_list[:limit]
642
+
643
+ return json.dumps({
644
+ "status": "success",
645
+ "total_papers": len(papers_metadata),
646
+ "showing": len(papers_list),
647
+ "sort_by": sort_by,
648
+ "papers": papers_list,
649
+ "message": f"Research memory contains {len(papers_metadata)} papers"
650
+ }, indent=2)
651
+
652
+ except Exception as e:
653
+ logger.error(f"Error listing papers: {e}")
654
+ return json.dumps({"status": "error", "error": "Failed to list papers", "message": str(e)}, indent=2)
655
+
656
+
657
+ @mcp.tool()
658
+ async def clear_research_memory(
659
+ confirm: bool = False
660
+ ) -> str:
661
+ """Clear all papers from research memory.
662
+
663
+ ⚠️ This will permanently delete all indexed research!
664
+
665
+ Args:
666
+ confirm: Must be True to actually clear memory
667
+
668
+ Returns:
669
+ Confirmation of memory clearing
670
+ """
671
+ global papers_metadata
672
+
673
+ if not confirm:
674
+ return json.dumps({
675
+ "status": "confirmation_required",
676
+ "message": "⚠️ This will delete all research memory. Set confirm=True to proceed.",
677
+ "current_papers": len(papers_metadata)
678
+ }, indent=2)
679
+
680
+ try:
681
+ # Check if memory manager needs initialization
682
+ # Only initialize if we have papers to clear
683
+ if papers_metadata and not memory_manager:
684
+ await ensure_initialized()
685
+
686
+ # Clear ChromaDB collection
687
+ if memory_manager and memory_manager.collection:
688
+ # Delete and recreate collection
689
+ memory_manager.chroma_client.delete_collection("als_research")
690
+ memory_manager.collection = memory_manager.chroma_client.create_collection("als_research")
691
+
692
+ # Reinitialize index
693
+ memory_manager._initialize_index()
694
+
695
+ # Clear metadata
696
+ num_papers = len(papers_metadata)
697
+ papers_metadata = {}
698
+
699
+ # Save empty metadata
700
+ if memory_manager:
701
+ memory_manager._save_metadata()
702
+
703
+ logger.info(f"Cleared research memory: {num_papers} papers removed")
704
+
705
+ return json.dumps({
706
+ "status": "success",
707
+ "message": f"✅ Research memory cleared. Removed {num_papers} papers.",
708
+ "papers_removed": num_papers
709
+ }, indent=2)
710
+
711
+ except Exception as e:
712
+ logger.error(f"Error clearing memory: {e}")
713
+ return json.dumps({"status": "error", "error": "Failed to clear memory", "message": str(e)}, indent=2)
714
+
715
+
716
+ if __name__ == "__main__":
717
+ # Check for required packages
718
+ if not LLAMAINDEX_AVAILABLE:
719
+ logger.error("LlamaIndex dependencies not installed!")
720
+ logger.info("Install with: pip install llama-index-core llama-index-vector-stores-chroma")
721
+ logger.info(" pip install chromadb sentence-transformers transformers")
722
+ else:
723
+ logger.info("LlamaIndex RAG server starting...")
724
+ logger.info(f"ChromaDB path: {CHROMA_DB_PATH}")
725
+ logger.info(f"Embedding model: {EMBED_MODEL}")
726
+ logger.info(f"Papers in memory: {len(papers_metadata)}")
727
+
728
+ # Run the MCP server
729
+ mcp.run(transport="stdio")
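
The deduplication above keys each paper on an MD5 hash of its DOI when one is available, falling back to the lowercased title; a self-contained sketch of that behavior (the sample DOI and titles are hypothetical):

import hashlib

def paper_id(title: str, doi: str | None = None) -> str:
    # Mirrors ResearchMemoryManager.generate_paper_id: DOI takes precedence
    if doi:
        return hashlib.md5(doi.encode()).hexdigest()
    return hashlib.md5(title.lower().encode()).hexdigest()

# Same DOI -> same ID, so re-indexing a reprint is caught as "already_indexed"
a = paper_id("TDP-43 Proteinopathy in ALS", doi="10.1000/xyz123")
b = paper_id("TDP-43 proteinopathy in ALS (reprint)", doi="10.1000/xyz123")
assert a == b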
servers/pubmed_server.py ADDED
@@ -0,0 +1,269 @@
1
+ # pubmed_server.py
2
+ from mcp.server.fastmcp import FastMCP
3
+ import httpx
4
+ import logging
5
+ import sys
6
+ from pathlib import Path
7
+
8
+ # Add parent directory to path for shared imports
9
+ sys.path.insert(0, str(Path(__file__).parent.parent))
10
+
11
+ from shared import (
12
+ config,
13
+ RateLimiter,
14
+ format_authors,
15
+ ErrorFormatter,
16
+ truncate_text
17
+ )
18
+ from shared.http_client import get_http_client
19
+
20
+ # Configure logging
21
+ logging.basicConfig(level=logging.INFO)
22
+ logger = logging.getLogger(__name__)
23
+
24
+ # Create FastMCP server
25
+ mcp = FastMCP("pubmed-server")
26
+
27
+ # Rate limiting using shared utility
28
+ rate_limiter = RateLimiter(config.rate_limits.pubmed_delay)
29
+
30
+
31
+ @mcp.tool()
32
+ async def search_pubmed(
33
+ query: str,
34
+ max_results: int = 10,
35
+ sort: str = "relevance"
36
+ ) -> str:
37
+ """Search PubMed for ALS research papers. Returns titles, abstracts, PMIDs, and publication dates.
38
+
39
+ Args:
40
+ query: Search query (e.g., 'ALS SOD1 therapy')
41
+ max_results: Maximum number of results (default: 10)
42
+ sort: Sort order - 'relevance' or 'date' (default: 'relevance')
43
+ """
44
+ try:
45
+ logger.info(f"Searching PubMed for: {query}")
46
+
47
+ # PubMed E-utilities API (no auth required)
48
+ base_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
49
+
50
+ # Rate limiting
51
+ await rate_limiter.wait()
52
+
53
+ # Step 1: Search for PMIDs
54
+ search_params = {
55
+ "db": "pubmed",
56
+ "term": query,
57
+ "retmax": max_results,
58
+ "retmode": "json",
59
+ "sort": sort
60
+ }
61
+
62
+ # Use shared HTTP client for connection pooling
63
+ client = get_http_client(timeout=config.api.timeout)
64
+
65
+ # Get PMIDs
66
+ search_resp = await client.get(f"{base_url}/esearch.fcgi", params=search_params)
67
+ search_resp.raise_for_status()
68
+ search_data = search_resp.json()
69
+ pmids = search_data.get("esearchresult", {}).get("idlist", [])
70
+
71
+ if not pmids:
72
+ logger.info(f"No results found for query: {query}")
73
+ return ErrorFormatter.no_results(query)
74
+
75
+ # Rate limiting
76
+ await rate_limiter.wait()
77
+
78
+ # Step 2: Fetch details for PMIDs
79
+ fetch_params = {
80
+ "db": "pubmed",
81
+ "id": ",".join(pmids),
82
+ "retmode": "xml"
83
+ }
84
+
85
+ fetch_resp = await client.get(f"{base_url}/efetch.fcgi", params=fetch_params)
86
+ fetch_resp.raise_for_status()
87
+
88
+ # Parse XML and extract key info
89
+ papers = parse_pubmed_xml(fetch_resp.text)
90
+
91
+ result = f"Found {len(papers)} papers for query: '{query}'\n\n"
92
+ for i, paper in enumerate(papers, 1):
93
+ result += f"{i}. **{paper['title']}**\n"
94
+ result += f" PMID: {paper['pmid']} | Published: {paper['date']}\n"
95
+ result += f" Authors: {paper['authors']}\n"
96
+ result += f" URL: https://pubmed.ncbi.nlm.nih.gov/{paper['pmid']}/\n"
97
+ result += f" Abstract: {paper['abstract'][:300]}{'...' if len(paper['abstract']) > 300 else ''}\n\n"
98
+
99
+ logger.info(f"Successfully retrieved {len(papers)} papers")
100
+ return result
101
+
102
+ except httpx.TimeoutException:
103
+ logger.error("PubMed API request timed out")
104
+ return "Error: PubMed API request timed out. Please try again."
105
+ except httpx.HTTPStatusError as e:
106
+ logger.error(f"PubMed API error: {e}")
107
+ return f"Error: PubMed API returned status {e.response.status_code}"
108
+ except Exception as e:
109
+ logger.error(f"Unexpected error in search_pubmed: {e}")
110
+ return f"Error: {str(e)}"
111
+
112
+
113
+ @mcp.tool()
114
+ async def get_paper_details(pmid: str) -> str:
115
+ """Get full details for a specific PubMed paper by PMID.
116
+
117
+ Args:
118
+ pmid: PubMed ID
119
+ """
120
+ try:
121
+ logger.info(f"Fetching details for PMID: {pmid}")
122
+
123
+ base_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
124
+
125
+ # Rate limiting
126
+ await rate_limiter.wait()
127
+
128
+ fetch_params = {
129
+ "db": "pubmed",
130
+ "id": pmid,
131
+ "retmode": "xml"
132
+ }
133
+
134
+ # Use shared HTTP client for connection pooling
135
+ client = get_http_client(timeout=config.api.timeout)
136
+ fetch_resp = await client.get(f"{base_url}/efetch.fcgi", params=fetch_params)
137
+ fetch_resp.raise_for_status()
138
+
139
+ papers = parse_pubmed_xml(fetch_resp.text)
140
+
141
+ if not papers:
142
+ return ErrorFormatter.not_found("paper", pmid)
143
+
144
+ paper = papers[0]
145
+
146
+ # Format detailed response
147
+ result = f"**{paper['title']}**\n\n"
148
+ result += f"**PMID:** {paper['pmid']}\n"
149
+ result += f"**Published:** {paper['date']}\n"
150
+ result += f"**Authors:** {paper['authors']}\n\n"
151
+ result += f"**Abstract:**\n{paper['abstract']}\n\n"
152
+ result += f"**Journal:** {paper.get('journal', 'N/A')}\n"
153
+ result += f"**DOI:** {paper.get('doi', 'N/A')}\n"
154
+ result += f"**PubMed URL:** https://pubmed.ncbi.nlm.nih.gov/{pmid}/\n"
155
+
156
+ logger.info(f"Successfully retrieved details for PMID: {pmid}")
157
+ return result
158
+
159
+ except httpx.TimeoutException:
160
+ logger.error("PubMed API request timed out")
161
+ return "Error: PubMed API request timed out. Please try again."
162
+ except httpx.HTTPStatusError as e:
163
+ logger.error(f"PubMed API error: {e}")
164
+ return f"Error: PubMed API returned status {e.response.status_code}"
165
+ except Exception as e:
166
+ logger.error(f"Unexpected error in get_paper_details: {e}")
167
+ return f"Error: {str(e)}"
168
+
169
+
170
+ def parse_pubmed_xml(xml_text: str) -> list[dict]:
171
+ """Parse PubMed XML response into structured data with error handling"""
172
+ import xml.etree.ElementTree as ET
173
+
174
+ papers = []
175
+
176
+ try:
177
+ root = ET.fromstring(xml_text)
178
+ except ET.ParseError as e:
179
+ logger.error(f"XML parsing error: {e}")
180
+ return papers
181
+
182
+ for article in root.findall(".//PubmedArticle"):
183
+ try:
184
+ # Extract title
185
+ title_elem = article.find(".//ArticleTitle")
186
+ title = "".join(title_elem.itertext()) if title_elem is not None else "No title"
187
+
188
+ # Extract abstract (may have multiple AbstractText elements)
189
+ abstract_parts = []
190
+ for abstract_elem in article.findall(".//AbstractText"):
191
+ if abstract_elem is not None and abstract_elem.text:
192
+ label = abstract_elem.get("Label", "")
193
+ text = "".join(abstract_elem.itertext())
194
+ if label:
195
+ abstract_parts.append(f"{label}: {text}")
196
+ else:
197
+ abstract_parts.append(text)
198
+ abstract = " ".join(abstract_parts) if abstract_parts else "No abstract available"
199
+
200
+ # Extract PMID
201
+ pmid_elem = article.find(".//PMID")
202
+ pmid = pmid_elem.text if pmid_elem is not None else "Unknown"
203
+
204
+ # Extract publication date (MedlineCitation/Article/Journal/JournalIssue/PubDate)
205
+ pub_date = article.find(".//MedlineCitation/Article/Journal/JournalIssue/PubDate")
206
+ if pub_date is not None:
207
+ year_elem = pub_date.find("Year")
208
+ month_elem = pub_date.find("Month")
209
+ year = year_elem.text if year_elem is not None else "Unknown"
210
+ month = month_elem.text if month_elem is not None else ""
211
+ date_str = f"{month} {year}" if month else year
212
+ else:
213
+ # Try alternative date location
214
+ date_completed = article.find(".//DateCompleted")
215
+ if date_completed is not None:
216
+ year_elem = date_completed.find("Year")
217
+ year = year_elem.text if year_elem is not None else "Unknown"
218
+ date_str = year
219
+ else:
220
+ date_str = "Unknown"
221
+
222
+ # Extract authors
223
+ authors = []
224
+ for author in article.findall(".//Author"):
225
+ last = author.find("LastName")
226
+ first = author.find("ForeName")
227
+ collective = author.find("CollectiveName")
228
+
229
+ if collective is not None and collective.text:
230
+ authors.append(collective.text)
231
+ elif last is not None and first is not None:
232
+ authors.append(f"{first.text} {last.text}")
233
+ elif last is not None:
234
+ authors.append(last.text)
235
+
236
+ # Format authors using shared utility
237
+ authors_str = format_authors("; ".join(authors), max_authors=3) if authors else "Unknown authors"
238
+
239
+ # Extract journal name
240
+ journal_elem = article.find(".//Journal/Title")
241
+ journal = journal_elem.text if journal_elem is not None else "Unknown"
242
+
243
+ # Extract DOI
244
+ doi = None
245
+ for article_id in article.findall(".//ArticleId"):
246
+ if article_id.get("IdType") == "doi":
247
+ doi = article_id.text
248
+ break
249
+
250
+ papers.append({
251
+ "title": title,
252
+ "abstract": abstract,
253
+ "pmid": pmid,
254
+ "date": date_str,
255
+ "authors": authors_str,
256
+ "journal": journal,
257
+ "doi": doi or "N/A"
258
+ })
259
+
260
+ except Exception as e:
261
+ logger.warning(f"Error parsing article: {e}")
262
+ continue
263
+
264
+ return papers
265
+
266
+
267
+ if __name__ == "__main__":
268
+ # Run with stdio transport
269
+ mcp.run(transport="stdio")
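
The search tool above wraps NCBI's two-step E-utilities flow: esearch returns matching PMIDs, then efetch returns the full records for those PMIDs. A standalone sketch of the same flow, without the shared client, rate limiter, or XML parsing:

import asyncio
import httpx

async def demo() -> None:
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
    async with httpx.AsyncClient(timeout=15.0) as client:
        # Step 1: esearch returns PMIDs for the query as JSON
        resp = await client.get(f"{base}/esearch.fcgi", params={
            "db": "pubmed", "term": "ALS SOD1 therapy",
            "retmax": 3, "retmode": "json", "sort": "relevance"})
        pmids = resp.json().get("esearchresult", {}).get("idlist", [])
        # Step 2: efetch returns the full records for those PMIDs as XML
        resp = await client.get(f"{base}/efetch.fcgi", params={
            "db": "pubmed", "id": ",".join(pmids), "retmode": "xml"})
        print(resp.text[:200])

asyncio.run(demo())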
shared/__init__.py ADDED
@@ -0,0 +1,34 @@
1
+ # shared/__init__.py
2
+ """Shared utilities and configuration for ALS Research Agent"""
3
+
4
+ from .config import config, AppConfig, APIConfig, RateLimitConfig, ContentLimits, SecurityConfig
5
+ from .utils import (
6
+ RateLimiter,
7
+ safe_api_call,
8
+ truncate_text,
9
+ format_authors,
10
+ clean_whitespace,
11
+ ErrorFormatter,
12
+ create_citation
13
+ )
14
+ from .cache import SimpleCache
15
+
16
+ __all__ = [
17
+ # Configuration
18
+ 'config',
19
+ 'AppConfig',
20
+ 'APIConfig',
21
+ 'RateLimitConfig',
22
+ 'ContentLimits',
23
+ 'SecurityConfig',
24
+ # Utilities
25
+ 'RateLimiter',
26
+ 'safe_api_call',
27
+ 'truncate_text',
28
+ 'format_authors',
29
+ 'clean_whitespace',
30
+ 'ErrorFormatter',
31
+ 'create_citation',
32
+ # Cache
33
+ 'SimpleCache',
34
+ ]
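
Consumer-side usage of this package surface looks roughly like (a sketch):

from shared import config, RateLimiter, truncate_text

limiter = RateLimiter(config.rate_limits.pubmed_delay)  # ~3 requests/second
print(config.anthropic_model)
print(truncate_text("a" * 10_000, max_chars=100))  # truncated, with a notice appended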
shared/cache.py ADDED
@@ -0,0 +1,94 @@
1
+ # shared/cache.py
2
+ """Simple in-memory cache for API responses"""
3
+
4
+ import time
5
+ import hashlib
6
+ import json
7
+ from typing import Optional, Any
8
+ import logging
9
+
10
+ logger = logging.getLogger(__name__)
11
+
12
+
13
+ class SimpleCache:
14
+ """Simple TTL-based in-memory cache with size limits"""
15
+
16
+ def __init__(self, ttl: int = 3600, max_size: int = 100):
17
+ """
18
+ Initialize cache with TTL and size limits
19
+
20
+ Args:
21
+ ttl: Time to live in seconds (default: 1 hour)
22
+ max_size: Maximum number of cached entries (default: 100)
23
+ """
24
+ self.cache = {}
25
+ self.ttl = ttl
26
+ self.max_size = max_size
27
+
28
+ def _make_key(self, tool_name: str, arguments: dict) -> str:
29
+ """Create cache key from tool name and arguments"""
30
+ # Sort dict for consistent hashing
31
+ args_str = json.dumps(arguments, sort_keys=True)
32
+ key_str = f"{tool_name}:{args_str}"
33
+ return hashlib.md5(key_str.encode()).hexdigest()
34
+
35
+ def get(self, tool_name: str, arguments: dict) -> Optional[str]:
36
+ """Get cached result if available and not expired"""
37
+ key = self._make_key(tool_name, arguments)
38
+
39
+ if key in self.cache:
40
+ result, timestamp = self.cache[key]
41
+
42
+ # Check if expired
43
+ if time.time() - timestamp < self.ttl:
44
+ logger.info(f"Cache HIT for {tool_name}")
45
+ return result
46
+ else:
47
+ # Remove expired entry
48
+ del self.cache[key]
49
+ logger.info(f"Cache EXPIRED for {tool_name}")
50
+
51
+ logger.info(f"Cache MISS for {tool_name}")
52
+ return None
53
+
54
+ def set(self, tool_name: str, arguments: dict, result: str) -> None:
55
+ """Store result in cache, evicting the oldest entry if at capacity"""
56
+ key = self._make_key(tool_name, arguments)
57
+
58
+ # Check if we need to evict an entry
59
+ if len(self.cache) >= self.max_size and key not in self.cache:
60
+ # Evict the entry with the oldest insert time (FIFO-style; get() never refreshes timestamps, so this is not true LRU)
61
+ if self.cache: # Safety check
62
+ oldest_key = min(self.cache.keys(),
63
+ key=lambda k: self.cache[k][1])
64
+ del self.cache[oldest_key]
65
+ logger.debug("Evicted oldest cache entry to maintain size limit")
66
+
67
+ self.cache[key] = (result, time.time())
68
+ logger.debug(f"Cached result for {tool_name} (cache size: {len(self.cache)}/{self.max_size})")
69
+
70
+ def clear(self) -> None:
71
+ """Clear all cache entries"""
72
+ self.cache.clear()
73
+ logger.info("Cache cleared")
74
+
75
+ def size(self) -> int:
76
+ """Get number of cached items"""
77
+ return len(self.cache)
78
+
79
+ def cleanup_expired(self) -> int:
80
+ """Remove all expired entries and return count of removed items"""
81
+ expired_keys = []
82
+ current_time = time.time()
83
+
84
+ for key, (result, timestamp) in self.cache.items():
85
+ if current_time - timestamp >= self.ttl:
86
+ expired_keys.append(key)
87
+
88
+ for key in expired_keys:
89
+ del self.cache[key]
90
+
91
+ if expired_keys:
92
+ logger.info(f"Cleaned up {len(expired_keys)} expired cache entries")
93
+
94
+ return len(expired_keys)
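
A minimal read-through usage sketch of SimpleCache (the query and cached value are placeholders):

from shared.cache import SimpleCache

cache = SimpleCache(ttl=3600, max_size=100)
args = {"query": "ALS SOD1 therapy", "max_results": 10}

result = cache.get("search_pubmed", args)    # None on first call (cache MISS)
if result is None:
    result = "...expensive API response..."  # placeholder for the real tool call
    cache.set("search_pubmed", args, result)

print(cache.size())             # 1
print(cache.cleanup_expired())  # 0: nothing has expired yet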
shared/config.py ADDED
@@ -0,0 +1,134 @@
1
+ # shared/config.py
2
+ """Shared configuration for MCP servers"""
3
+
4
+ import os
5
+ from dataclasses import dataclass
6
+ from typing import Optional
7
+
8
+
9
+ @dataclass
10
+ class APIConfig:
11
+ """Configuration for API calls"""
12
+ timeout: float = 15.0 # Reduced from 30s - PubMed typically responds in <1s
13
+ max_retries: int = 3
14
+ user_agent: str = "Mozilla/5.0 (compatible; ALS-Research-Bot/1.0)"
15
+
16
+
17
+ @dataclass
18
+ class RateLimitConfig:
19
+ """Rate limiting configuration for different APIs"""
20
+ # PubMed: 3 req/sec without key, 10 req/sec with key
21
+ pubmed_delay: float = 0.34 # ~3 requests per second
22
+
23
+ # ClinicalTrials.gov: conservative limit (API limit is ~50 req/min)
24
+ clinicaltrials_delay: float = 1.5 # ~40 requests per minute (safe margin)
25
+
26
+ # bioRxiv/medRxiv: be respectful
27
+ biorxiv_delay: float = 1.0 # 1 request per second
28
+
29
+ # General web fetching
30
+ fetch_delay: float = 0.5
31
+
32
+
33
+ @dataclass
34
+ class ContentLimits:
35
+ """Content size and length limits"""
36
+ # Maximum content size for downloads (10MB)
37
+ max_content_size: int = 10 * 1024 * 1024
38
+
39
+ # Maximum characters for LLM context
40
+ max_text_chars: int = 8000
41
+
42
+ # Maximum abstract preview length
43
+ max_abstract_preview: int = 300
44
+
45
+ # Maximum description preview length
46
+ max_description_preview: int = 500
47
+
48
+
49
+ @dataclass
50
+ class SecurityConfig:
51
+ """Security-related configuration"""
52
+ allowed_schemes: Optional[list[str]] = None
53
+ blocked_hosts: Optional[list[str]] = None
54
+
55
+ def __post_init__(self):
56
+ if self.allowed_schemes is None:
57
+ self.allowed_schemes = ['http', 'https']
58
+
59
+ if self.blocked_hosts is None:
60
+ self.blocked_hosts = [
61
+ 'localhost',
62
+ '127.0.0.1',
63
+ '0.0.0.0',
64
+ '[::1]'
65
+ ]
66
+
67
+ def is_private_ip(self, hostname: str) -> bool:
68
+ """Check if hostname is a private IP"""
69
+ hostname_lower = hostname.lower()
70
+
71
+ # Check exact matches
72
+ if hostname_lower in self.blocked_hosts:
73
+ return True
74
+
75
+ # Check private IP ranges
76
+ if hostname_lower.startswith(('192.168.', '10.')):
77
+ return True
78
+
79
+ # Check 172.16-31 range
80
+ if hostname_lower.startswith('172.'):
81
+ try:
82
+ second_octet = int(hostname.split('.')[1])
83
+ if 16 <= second_octet <= 31:
84
+ return True
85
+ except (ValueError, IndexError):
86
+ pass
87
+
88
+ return False
89
+
90
+
91
+ @dataclass
92
+ class AppConfig:
93
+ """Application-wide configuration"""
94
+ # API configurations
95
+ api: Optional[APIConfig] = None
96
+ rate_limits: Optional[RateLimitConfig] = None
97
+ content_limits: Optional[ContentLimits] = None
98
+ security: Optional[SecurityConfig] = None
99
+
100
+ # Environment variables
101
+ anthropic_api_key: Optional[str] = None
102
+ anthropic_model: str = "claude-sonnet-4-5-20250929"
103
+ gradio_port: int = 7860
104
+ log_level: str = "INFO"
105
+
106
+ # PubMed email (optional, increases rate limit)
107
+ pubmed_email: Optional[str] = None
108
+
109
+ def __post_init__(self):
110
+ # Initialize sub-configs
111
+ if self.api is None:
112
+ self.api = APIConfig()
113
+ if self.rate_limits is None:
114
+ self.rate_limits = RateLimitConfig()
115
+ if self.content_limits is None:
116
+ self.content_limits = ContentLimits()
117
+ if self.security is None:
118
+ self.security = SecurityConfig()
119
+
120
+ # Load from environment
121
+ self.anthropic_api_key = os.getenv("ANTHROPIC_API_KEY", self.anthropic_api_key)
122
+ self.anthropic_model = os.getenv("ANTHROPIC_MODEL", self.anthropic_model)
123
+ self.gradio_port = int(os.getenv("GRADIO_SERVER_PORT", self.gradio_port))
124
+ self.log_level = os.getenv("LOG_LEVEL", self.log_level)
125
+ self.pubmed_email = os.getenv("PUBMED_EMAIL", self.pubmed_email)
126
+
127
+ @classmethod
128
+ def from_env(cls) -> 'AppConfig':
129
+ """Create configuration from environment variables"""
130
+ return cls()
131
+
132
+
133
+ # Global configuration instance
134
+ config = AppConfig.from_env()
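
Because AppConfig reads the environment in __post_init__, settings can be overridden by exporting variables before the module is imported; a sketch:

import os
os.environ["ANTHROPIC_MODEL"] = "claude-3-haiku-20240307"
os.environ["GRADIO_SERVER_PORT"] = "7861"

from shared.config import AppConfig

cfg = AppConfig.from_env()
print(cfg.anthropic_model, cfg.gradio_port)      # claude-3-haiku-20240307 7861
print(cfg.security.is_private_ip("172.20.0.1"))  # True: inside the 172.16-31 private range
print(cfg.security.is_private_ip("8.8.8.8"))     # False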
shared/http_client.py ADDED
@@ -0,0 +1,68 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Shared HTTP client with connection pooling for better performance.
4
+ All MCP servers should use this instead of creating new clients for each request.
5
+ """
6
+
7
+ import httpx
8
+ from typing import Optional
9
+
10
+ # Global HTTP client with connection pooling
11
+ # This maintains persistent connections to servers for faster subsequent requests
12
+ _http_client: Optional[httpx.AsyncClient] = None
13
+
14
+ def get_http_client(timeout: float = 30.0) -> httpx.AsyncClient:
15
+ """
16
+ Get the shared HTTP client with connection pooling.
17
+
18
+ NOTE: For different timeout values, use CustomHTTPClient context manager
19
+ instead to avoid conflicts between servers.
20
+
21
+ Args:
22
+ timeout: Request timeout in seconds (default 30)
23
+
24
+ Returns:
25
+ Shared httpx.AsyncClient instance
26
+ """
27
+ global _http_client
28
+
29
+ if _http_client is None or _http_client.is_closed:
30
+ _http_client = httpx.AsyncClient(
31
+ timeout=httpx.Timeout(timeout),
32
+ limits=httpx.Limits(
33
+ max_connections=100, # Maximum number of connections
34
+ max_keepalive_connections=20, # Keep 20 connections alive for reuse
35
+ keepalive_expiry=300 # Keep connections alive for 5 minutes
36
+ ),
37
+ # Follow redirects by default
38
+ follow_redirects=True
39
+ )
40
+
41
+ return _http_client
42
+
43
+ async def close_http_client():
44
+ """Close the shared HTTP client (call on shutdown)."""
45
+ global _http_client
46
+ if _http_client and not _http_client.is_closed:
47
+ await _http_client.aclose()
48
+ _http_client = None
49
+
50
+ # Context manager for temporary clients with custom settings
51
+ class CustomHTTPClient:
52
+ """Context manager for creating temporary HTTP clients with custom settings."""
53
+
54
+ def __init__(self, timeout: float = 30.0, **kwargs):
55
+ self.timeout = timeout
56
+ self.kwargs = kwargs
57
+ self.client = None
58
+
59
+ async def __aenter__(self):
60
+ self.client = httpx.AsyncClient(
61
+ timeout=httpx.Timeout(self.timeout),
62
+ **self.kwargs
63
+ )
64
+ return self.client
65
+
66
+ async def __aexit__(self, exc_type, exc_val, exc_tb):
67
+ if self.client:
68
+ await self.client.aclose()
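
A usage sketch contrasting the pooled client with a one-off CustomHTTPClient (the URLs are placeholders):

import asyncio
from shared.http_client import CustomHTTPClient, close_http_client, get_http_client

async def main() -> None:
    # Pooled client: reused across calls, with keep-alive connections
    client = get_http_client(timeout=15.0)
    resp = await client.get("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi")
    print(resp.status_code)

    # One-off client with custom settings, closed automatically on exit
    async with CustomHTTPClient(timeout=60.0, follow_redirects=True) as one_off:
        resp = await one_off.get("https://clinicaltrials.gov/")
        print(resp.status_code)

    await close_http_client()  # release pooled connections on shutdown

asyncio.run(main())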
shared/utils.py ADDED
@@ -0,0 +1,194 @@
1
+ # shared/utils.py
2
+ """Shared utilities for MCP servers"""
3
+
4
+ import asyncio
5
+ import time
6
+ import logging
7
+ from typing import Optional, Callable, Any
8
+ from mcp.types import TextContent
9
+ import httpx
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+
14
+ class RateLimiter:
15
+ """Rate limiter for API calls"""
16
+
17
+ def __init__(self, delay: float):
18
+ """
19
+ Initialize rate limiter
20
+
21
+ Args:
22
+ delay: Minimum delay between requests in seconds
23
+ """
24
+ self.delay = delay
25
+ self.last_request_time: Optional[float] = None
26
+
27
+ async def wait(self) -> None:
28
+ """Wait if necessary to respect rate limit"""
29
+ if self.last_request_time is not None:
30
+ elapsed = time.time() - self.last_request_time
31
+ if elapsed < self.delay:
32
+ await asyncio.sleep(self.delay - elapsed)
33
+ self.last_request_time = time.time()
34
+
35
+
36
+ async def safe_api_call(
37
+ func: Callable,
38
+ *args: Any,
39
+ timeout: float = 30.0,
40
+ error_prefix: str = "API",
41
+ **kwargs: Any
42
+ ) -> list[TextContent]:
43
+ """
44
+ Safely execute an API call with comprehensive error handling
45
+
46
+ Args:
47
+ func: Async function to call
48
+ *args: Positional arguments for func
49
+ timeout: Timeout in seconds
50
+ error_prefix: Prefix for error messages
51
+ **kwargs: Keyword arguments for func
52
+
53
+ Returns:
54
+ list[TextContent]: Result or error message
55
+ """
56
+ try:
57
+ return await asyncio.wait_for(func(*args, **kwargs), timeout=timeout)
58
+
59
+ except asyncio.TimeoutError:
60
+ logger.error(f"{error_prefix} request timed out after {timeout}s")
61
+ return [TextContent(
62
+ type="text",
63
+ text=f"Error: {error_prefix} request timed out after {timeout} seconds. Please try again."
64
+ )]
65
+
66
+ except httpx.TimeoutException:
67
+ logger.error(f"{error_prefix} request timed out")
68
+ return [TextContent(
69
+ type="text",
70
+ text=f"Error: {error_prefix} request timed out. Please try again."
71
+ )]
72
+
73
+ except httpx.HTTPStatusError as e:
74
+ logger.error(f"{error_prefix} error: HTTP {e.response.status_code}")
75
+ return [TextContent(
76
+ type="text",
77
+ text=f"Error: {error_prefix} returned status {e.response.status_code}"
78
+ )]
79
+
80
+ except httpx.RequestError as e:
81
+ logger.error(f"{error_prefix} request error: {e}")
82
+ return [TextContent(
83
+ type="text",
84
+ text=f"Error: Failed to connect to {error_prefix}. Please check your connection."
85
+ )]
86
+
87
+ except Exception as e:
88
+ logger.error(f"Unexpected error in {error_prefix}: {e}", exc_info=True)
89
+ return [TextContent(
90
+ type="text",
91
+ text=f"Error: {str(e)}"
92
+ )]
93
+
94
+
95
+ def truncate_text(text: str, max_chars: int = 8000, suffix: str = "...") -> str:
96
+ """
97
+ Truncate text to maximum length with suffix
98
+
99
+ Args:
100
+ text: Text to truncate
101
+ max_chars: Maximum character count
102
+ suffix: Suffix to add when truncated
103
+
104
+ Returns:
105
+ Truncated text
106
+ """
107
+ if len(text) <= max_chars:
108
+ return text
109
+
110
+ return text[:max_chars] + f"\n\n[Content truncated at {max_chars} characters]{suffix}"
111
+
112
+
113
+ def format_authors(authors: str, max_authors: int = 3) -> str:
114
+ """
115
+ Format author list with et al. if needed
116
+
117
+ Args:
118
+ authors: Semicolon-separated author list
119
+ max_authors: Maximum authors to show
120
+
121
+ Returns:
122
+ Formatted author string
123
+ """
124
+ if not authors or authors == "Unknown":
125
+ return "Unknown authors"
126
+
127
+ author_list = [a.strip() for a in authors.split(";")]
128
+
129
+ if len(author_list) <= max_authors:
130
+ return ", ".join(author_list)
131
+
132
+ return ", ".join(author_list[:max_authors]) + " et al."
133
+
134
+
135
+ def clean_whitespace(text: str) -> str:
136
+ """
137
+ Clean up excessive whitespace in text
138
+
139
+ Args:
140
+ text: Text to clean
141
+
142
+ Returns:
143
+ Cleaned text
144
+ """
145
+ lines = (line.strip() for line in text.splitlines())
146
+ chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
147
+ return '\n'.join(chunk for chunk in chunks if chunk)
148
+
149
+
150
+ class ErrorFormatter:
151
+ """Consistent error message formatting"""
152
+
153
+ @staticmethod
154
+ def not_found(resource_type: str, identifier: str) -> str:
155
+ """Format not found error"""
156
+ return f"No {resource_type} found with identifier: {identifier}"
157
+
158
+ @staticmethod
159
+ def no_results(query: str, time_period: str = "") -> str:
160
+ """Format no results error"""
161
+ time_str = f" {time_period}" if time_period else ""
162
+ return f"No results found for query: {query}{time_str}"
163
+
164
+ @staticmethod
165
+ def validation_error(field: str, issue: str) -> str:
166
+ """Format validation error"""
167
+ return f"Validation error: {field} - {issue}"
168
+
169
+ @staticmethod
170
+ def api_error(service: str, status_code: int) -> str:
171
+ """Format API error"""
172
+ return f"Error: {service} API returned status {status_code}"
173
+
174
+
175
+ def create_citation(
176
+ identifier: str,
177
+ identifier_type: str,
178
+ url: Optional[str] = None
179
+ ) -> str:
180
+ """
181
+ Create a formatted citation string
182
+
183
+ Args:
184
+ identifier: Citation identifier (PMID, DOI, NCT ID)
185
+ identifier_type: Type of identifier
186
+ url: Optional URL
187
+
188
+ Returns:
189
+ Formatted citation
190
+ """
191
+ citation = f"{identifier_type}: {identifier}"
192
+ if url:
193
+ citation += f" | URL: {url}"
194
+ return citation
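
A few quick usage sketches for the helpers above:

import asyncio
from shared.utils import RateLimiter, format_authors, truncate_text

print(format_authors("Smith J; Doe A; Lee K; Park M"))  # Smith J, Doe A, Lee K et al.
print(truncate_text("short text", max_chars=8000))      # returned unchanged: under the limit

async def main() -> None:
    limiter = RateLimiter(delay=0.34)  # ~3 requests/second, as used by the PubMed server
    for _ in range(3):
        await limiter.wait()           # sleeps only when called faster than the delay

asyncio.run(main())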
smart_cache.py ADDED
@@ -0,0 +1,458 @@
+ #!/usr/bin/env python3
+ """
+ Smart Cache System for ALS Research Agent
+ Features:
+ - Query normalization to match similar queries
+ - Cache pre-warming with common queries
+ - High-frequency question optimization
+ """
+
+ import json
+ import hashlib
+ import os
+ import re
+ import asyncio
+ import logging
+ from typing import Dict, List, Optional, Any
+ from datetime import datetime, timedelta
+
+ logger = logging.getLogger(__name__)
+
+
+ class SmartCache:
+     """Advanced caching system with query normalization and pre-warming"""
+
+     def __init__(self, cache_dir: str = ".cache", ttl_hours: int = 24):
+         """
+         Initialize smart cache system.
+
+         Args:
+             cache_dir: Directory for cache storage
+             ttl_hours: Time-to-live for cached entries in hours
+         """
+         self.cache_dir = cache_dir
+         self.ttl = timedelta(hours=ttl_hours)
+         self.cache = {}  # In-memory cache
+         self.normalized_cache = {}  # Maps normalized queries to original cache keys
+         self.high_frequency_queries = {}  # User-specified common queries
+         self.query_stats = {}  # Track query frequency
+
+         # Ensure cache directory exists (os is imported at module level)
+         os.makedirs(cache_dir, exist_ok=True)
+
+         # Load persistent cache on init
+         self.load_cache()
+
+     def normalize_query(self, query: str) -> str:
+         """
+         Normalize query for better cache matching.
+
+         Handles variations like:
+         - "ALS gene therapy" vs "gene therapy ALS"
+         - "What are the latest trials" vs "what are latest trials"
+         - Different word orders, case, punctuation
+         """
+         # Convert to lowercase
+         normalized = query.lower().strip()
+
+         # Remove common question words that don't affect meaning
+         question_words = [
+             'what', 'how', 'when', 'where', 'why', 'who', 'which',
+             'are', 'is', 'the', 'a', 'an', 'there', 'can', 'could',
+             'would', 'should', 'do', 'does', 'did', 'have', 'has', 'had',
+             # common filler words, so phrasings like "treatments for ALS"
+             # match "ALS treatments"
+             'for', 'and'
+         ]
+
+         # Remove punctuation
+         normalized = re.sub(r'[^\w\s]', ' ', normalized)
+
+         # Split into words and remove question words
+         words = normalized.split()
+         content_words = [w for w in words if w not in question_words]
+
+         # Sort words alphabetically for consistent ordering
+         # This makes "ALS gene therapy" match "gene therapy ALS"
+         content_words.sort()
+
+         # Join back together
+         normalized = ' '.join(content_words)
+
+         # Remove extra whitespace
+         normalized = ' '.join(normalized.split())
+
+         return normalized
+
+
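+     # Illustrative: these two phrasings normalize to the same key and
+     # therefore share a cache entry:
+     #   normalize_query("What are the latest ALS gene therapy trials?")
+     #   normalize_query("gene therapy trials ALS latest")
+     #   -> both yield "als gene latest therapy trials"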
+     def generate_cache_key(self, query: str, include_normalization: bool = True) -> str:
+         """
+         Generate a cache key for a query.
+
+         Args:
+             query: The original query
+             include_normalization: Whether to also store normalized version
+
+         Returns:
+             Hash-based cache key
+         """
+         # Generate hash of original query
+         original_hash = hashlib.sha256(query.encode()).hexdigest()[:16]
+
+         if include_normalization:
+             # Also store mapping from normalized query to this cache key
+             normalized = self.normalize_query(query)
+             normalized_hash = hashlib.sha256(normalized.encode()).hexdigest()[:16]
+
+             # Store mapping for future lookups
+             if normalized_hash not in self.normalized_cache:
+                 self.normalized_cache[normalized_hash] = []
+             if original_hash not in self.normalized_cache[normalized_hash]:
+                 self.normalized_cache[normalized_hash].append(original_hash)
+
+         return original_hash
+
+     def find_similar_cached(self, query: str) -> Optional[Dict[str, Any]]:
+         """
+         Find cached results for similar queries.
+
+         Args:
+             query: The query to search for
+
+         Returns:
+             Cached result if found, None otherwise
+         """
+         # First try exact match
+         exact_key = self.generate_cache_key(query, include_normalization=False)
+         if exact_key in self.cache:
+             entry = self.cache[exact_key]
+             if self._is_valid(entry):
+                 logger.info(f"Cache hit (exact): {query[:50]}...")
+                 self._update_stats(query)
+                 return entry['result']
+
+         # Try normalized match
+         normalized = self.normalize_query(query)
+         normalized_key = hashlib.sha256(normalized.encode()).hexdigest()[:16]
+
+         if normalized_key in self.normalized_cache:
+             # Check all original queries that normalize to this
+             for original_key in self.normalized_cache[normalized_key]:
+                 if original_key in self.cache:
+                     entry = self.cache[original_key]
+                     if self._is_valid(entry):
+                         logger.info(f"Cache hit (normalized): {query[:50]}...")
+                         self._update_stats(query)
+                         return entry['result']
+
+         logger.info(f"Cache miss: {query[:50]}...")
+         return None
+
+     def store(self, query: str, result: Any, metadata: Optional[Dict] = None):
+         """
+         Store a query result in cache.
+
+         Args:
+             query: The original query
+             result: The result to cache
+             metadata: Optional metadata about the result
+         """
+         cache_key = self.generate_cache_key(query, include_normalization=True)
+
+         entry = {
+             'query': query,
+             'result': result,
+             'timestamp': datetime.now().isoformat(),
+             'metadata': metadata or {},
+             'access_count': 0
+         }
+
+         self.cache[cache_key] = entry
+         self._update_stats(query)
+
+         # Persist to disk asynchronously (non-blocking) when an event loop is
+         # running; otherwise fall back to a synchronous save so store() also
+         # works from plain synchronous code
+         try:
+             asyncio.get_running_loop()
+             asyncio.create_task(self._save_cache_async())
+         except RuntimeError:
+             self.save_cache()
+
+         logger.info(f"Cached result for: {query[:50]}...")
+
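+     # Illustrative round trip: a result stored under one phrasing is
+     # returned for a reordered phrasing (result value is made up):
+     #   cache.store("What are the latest ALS treatments?", {"papers": [...]})
+     #   cache.find_similar_cached("latest treatments ALS")  # -> {"papers": [...]}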
+     def _is_valid(self, entry: Dict) -> bool:
+         """Check if a cache entry is still valid (not expired)"""
+         try:
+             timestamp = datetime.fromisoformat(entry['timestamp'])
+             age = datetime.now() - timestamp
+             return age < self.ttl
+         except (KeyError, TypeError, ValueError):
+             # Malformed entry (missing or unparseable timestamp)
+             return False
+
+     def _update_stats(self, query: str):
+         """Update query frequency statistics"""
+         normalized = self.normalize_query(query)
+         if normalized not in self.query_stats:
+             self.query_stats[normalized] = {'count': 0, 'last_access': None}
+
+         self.query_stats[normalized]['count'] += 1
+         self.query_stats[normalized]['last_access'] = datetime.now().isoformat()
+
+     async def pre_warm_cache(self, queries: List[Dict[str, Any]],
+                              search_func=None, llm_func=None):
+         """
+         Pre-warm cache with common queries.
+
+         Args:
+             queries: List of dicts with 'query', 'search_terms', 'use_claude' keys
+             search_func: Async function to perform searches
+             llm_func: Async function to call Claude for high-priority queries
+         """
+         logger.info(f"Pre-warming cache with {len(queries)} queries...")
+
+         for query_config in queries:
+             query = query_config['query']
+
+             # Check if already cached
+             if self.find_similar_cached(query):
+                 logger.info(f"Already cached: {query}")
+                 continue
+
+             try:
+                 # Use optimized search terms if provided
+                 search_terms = query_config.get('search_terms', query)
+                 use_claude = query_config.get('use_claude', False)
+
+                 if search_func:
+                     # Perform search with optimized terms
+                     logger.info(f"Pre-warming: {query}")
+
+                     if use_claude and llm_func:
+                         # Use Claude for high-priority queries
+                         result = await llm_func(search_terms)
+                     else:
+                         # Use standard search
+                         result = await search_func(search_terms)
+
+                     # Cache the result
+                     self.store(query, result, {
+                         'pre_warmed': True,
+                         'optimized_terms': search_terms,
+                         'used_claude': use_claude
+                     })
+
+                     # Small delay to avoid overwhelming APIs
+                     await asyncio.sleep(1)
+
+             except Exception as e:
+                 logger.error(f"Failed to pre-warm cache for '{query}': {e}")
+
+     def add_high_frequency_query(self, query: str, config: Dict[str, Any]):
+         """
+         Add a high-frequency query configuration.
+
+         Args:
+             query: The query pattern
+             config: Configuration dict with search_terms, use_claude, etc.
+         """
+         normalized = self.normalize_query(query)
+         self.high_frequency_queries[normalized] = {
+             'original': query,
+             'config': config,
+             'added': datetime.now().isoformat()
+         }
+         logger.info(f"Added high-frequency query: {query}")
+
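+     # Illustrative wiring (config keys mirror DEFAULT_PREWARM_QUERIES below):
+     #   cache.add_high_frequency_query(
+     #       "What are the latest ALS treatments?",
+     #       {"search_terms": "ALS treatment therapy 2024", "use_claude": True},
+     #   )
+     #   cache.get_high_frequency_config("latest ALS treatments")  # -> that config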
+     def get_high_frequency_config(self, query: str) -> Optional[Dict[str, Any]]:
+         """
+         Get configuration for a high-frequency query if it matches.
+
+         Args:
+             query: The query to check
+
+         Returns:
+             Configuration dict if this is a high-frequency query
+         """
+         normalized = self.normalize_query(query)
+         if normalized in self.high_frequency_queries:
+             return self.high_frequency_queries[normalized]['config']
+         return None
+
+     def get_cache_stats(self) -> Dict[str, Any]:
+         """Get cache statistics"""
+         valid_entries = sum(1 for entry in self.cache.values() if self._is_valid(entry))
+         total_entries = len(self.cache)
+
+         # Get top queries
+         top_queries = sorted(
+             self.query_stats.items(),
+             key=lambda x: x[1]['count'],
+             reverse=True
+         )[:10]
+
+         return {
+             'total_entries': total_entries,
+             'valid_entries': valid_entries,
+             'expired_entries': total_entries - valid_entries,
+             'normalized_groups': len(self.normalized_cache),
+             'high_frequency_queries': len(self.high_frequency_queries),
+             'top_queries': [
+                 {'query': q, 'count': stats['count']}
+                 for q, stats in top_queries
+             ]
+         }
+
+     def clear_expired(self):
+         """Remove expired entries from cache"""
+         expired_keys = [
+             key for key, entry in self.cache.items()
+             if not self._is_valid(entry)
+         ]
+
+         for key in expired_keys:
+             del self.cache[key]
+
+         if expired_keys:
+             logger.info(f"Cleared {len(expired_keys)} expired cache entries")
+             self.save_cache()
+
+     def save_cache(self):
+         """Persist cache to disk"""
+         cache_file = f"{self.cache_dir}/smart_cache.json"
+         try:
+             with open(cache_file, 'w') as f:
+                 json.dump({
+                     'cache': self.cache,
+                     'normalized_cache': self.normalized_cache,
+                     'high_frequency_queries': self.high_frequency_queries,
+                     'query_stats': self.query_stats
+                 }, f, indent=2)
+             logger.debug(f"Cache saved to {cache_file}")
+         except Exception as e:
+             logger.error(f"Failed to save cache: {e}")
+
+     async def _save_cache_async(self):
+         """Async version of save_cache that doesn't block"""
+         try:
+             await asyncio.to_thread(self.save_cache)
+         except Exception as e:
+             logger.error(f"Failed to save cache asynchronously: {e}")
+
+     def load_cache(self):
+         """Load cache from disk"""
+         cache_file = f"{self.cache_dir}/smart_cache.json"
+         try:
+             with open(cache_file, 'r') as f:
+                 data = json.load(f)
+             self.cache = data.get('cache', {})
+             self.normalized_cache = data.get('normalized_cache', {})
+             self.high_frequency_queries = data.get('high_frequency_queries', {})
+             self.query_stats = data.get('query_stats', {})
+
+             # Clear expired entries on load
+             self.clear_expired()
+
+             logger.info(f"Loaded cache with {len(self.cache)} entries")
+         except FileNotFoundError:
+             logger.info("No existing cache file found")
+         except Exception as e:
+             logger.error(f"Failed to load cache: {e}")
+
+
+ # Configuration for common ALS queries to pre-warm
+ DEFAULT_PREWARM_QUERIES = [
+     {
+         'query': 'What are the latest ALS treatments?',
+         'search_terms': 'ALS treatment therapy 2024 riluzole edaravone',
+         'use_claude': True  # High-frequency, use Claude for best results
+     },
+     {
+         'query': 'Gene therapy for ALS',
+         'search_terms': 'ALS gene therapy SOD1 C9orf72 clinical trial',
+         'use_claude': True
+     },
+     {
+         'query': 'ALS clinical trials',
+         'search_terms': 'ALS clinical trials recruiting phase 2 phase 3',
+         'use_claude': False
+     },
+     {
+         'query': 'What causes ALS?',
+         'search_terms': 'ALS etiology pathogenesis genetic environmental factors',
+         'use_claude': True
+     },
+     {
+         'query': 'ALS symptoms and diagnosis',
+         'search_terms': 'ALS symptoms diagnosis EMG criteria El Escorial',
+         'use_claude': False
+     },
+     {
+         'query': 'Stem cell therapy for ALS',
+         'search_terms': 'ALS stem cell therapy mesenchymal clinical trial',
+         'use_claude': False
+     },
+     {
+         'query': 'ALS prognosis and life expectancy',
+         'search_terms': 'ALS prognosis survival life expectancy factors',
+         'use_claude': True
+     },
+     {
+         'query': 'New ALS drugs',
+         'search_terms': 'ALS new drugs FDA approved pipeline 2024',
+         'use_claude': False
+     },
+     {
+         'query': 'ALS biomarkers',
+         'search_terms': 'ALS biomarkers neurofilament TDP-43 diagnostic prognostic',
+         'use_claude': False
+     },
+     {
+         'query': 'Is there a cure for ALS?',
+         'search_terms': 'ALS cure breakthrough research treatment advances',
+         'use_claude': True
+     }
+ ]
+
+
+ def test_smart_cache():
+     """Test the smart cache functionality"""
+     print("Testing Smart Cache System")
+     print("=" * 60)
+
+     cache = SmartCache()
+
+     # Test query normalization
+     test_queries = [
+         ("What are the latest ALS gene therapy trials?", "ALS gene therapy trials"),
+         ("gene therapy ALS", "ALS gene therapy"),
+         ("What is ALS?", "ALS"),
+         ("HOW does riluzole work for ALS?", "ALS riluzole work"),
+     ]
+
+     print("\n1. Query Normalization Tests:")
+     for original, expected_words in test_queries:
+         normalized = cache.normalize_query(original)
+         print(f"   Original:   {original}")
+         print(f"   Normalized: {normalized}")
+         print(f"   Expected words present: {all(w in normalized for w in expected_words.lower().split())}")
+         print()
+
+     # Test similar query matching
+     print("\n2. Similar Query Matching:")
+     cache.store("What are the latest ALS treatments?", {"result": "Treatment data"})
+
+     similar_queries = [
+         "latest ALS treatments",
+         "ALS latest treatments",
+         "What are latest treatments for ALS?",
+         "treatments ALS latest"
+     ]
+
+     for query in similar_queries:
+         result = cache.find_similar_cached(query)
+         print(f"   Query: {query}")
+         print(f"   Found: {result is not None}")
+
+     # Test cache statistics
+     print("\n3. Cache Statistics:")
+     stats = cache.get_cache_stats()
+     print(f"   Total entries: {stats['total_entries']}")
+     print(f"   Valid entries: {stats['valid_entries']}")
+     print(f"   Normalized groups: {stats['normalized_groups']}")
+
+     print("\n✅ Smart cache tests completed!")
+
+
+ if __name__ == "__main__":
+     test_smart_cache()
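
A minimal end-to-end sketch of wiring the cache into the app, assuming this file is importable as smart_cache; the search coroutine below is a stand-in, not the app's real PubMed/bioRxiv search code:

    import asyncio
    from smart_cache import SmartCache, DEFAULT_PREWARM_QUERIES

    async def stub_search(terms: str) -> dict:
        # Stand-in for the app's real literature search
        return {"terms": terms, "results": []}

    async def main():
        cache = SmartCache(cache_dir=".cache", ttl_hours=24)

        # Warm the cache with the common ALS queries defined above
        # (with no llm_func supplied, 'use_claude' entries fall back to stub_search)
        await cache.pre_warm_cache(DEFAULT_PREWARM_QUERIES, search_func=stub_search)

        # A differently worded user question still hits the warmed entry
        hit = cache.find_similar_cached("clinical trials ALS")
        print("cache hit:", hit is not None)
        print(cache.get_cache_stats())

        # Let any background save task finish before the loop closes
        await asyncio.sleep(0.2)

    asyncio.run(main())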