Enhance SPARKNET for TTO automation with new scenarios and security features
- Update frontend with TTO branding and coverage dashboard
- Add 5 core scenarios: Patent Wake-Up, Agreement Safety, Partner Matching, License Compliance, Award Identification
- Add CriticAgent validation visibility and confidence scoring
- Create License Compliance Monitoring module (scenario3)
- Create Award Identification module (scenario4)
- Add .env.example with comprehensive API key template
- Enhance secrets.toml.example for Streamlit Cloud
- Add SECURITY.md with GDPR compliance documentation
- Add human-in-the-loop decision points and source verification
- .env.example +133 -0
- .streamlit/secrets.toml.example +135 -10
- SECURITY.md +342 -0
- demo/app.py +583 -55
- demo/auth.py +42 -0
- demo/llm_providers.py +619 -191
- src/agents/scenario3/__init__.py +42 -0
- src/agents/scenario3/license_compliance_agent.py +321 -0
- src/agents/scenario3/milestone_verifier.py +344 -0
- src/agents/scenario3/payment_tracker.py +277 -0
- src/agents/scenario4/__init__.py +36 -0
- src/agents/scenario4/award_identification_agent.py +392 -0
- src/agents/scenario4/nomination_assistant.py +371 -0
- src/agents/scenario4/opportunity_scanner.py +278 -0
- src/workflow/langgraph_state.py +233 -2
.env.example
ADDED
@@ -0,0 +1,133 @@
# ============================================================================
# SPARKNET Environment Configuration
# ============================================================================
# Copy this file to .env and fill in your API keys
# NEVER commit .env to version control!
#
# For Streamlit Cloud deployment, add these to .streamlit/secrets.toml instead
# ============================================================================

# ============================================================================
# LLM Provider API Keys (Configure at least one for AI features)
# ============================================================================

# Groq - Fastest inference, 14,400 requests/day free
# Get key: https://console.groq.com/keys
GROQ_API_KEY=

# Google Gemini/AI Studio - 15 requests/min free
# Get key: https://aistudio.google.com/apikey
GOOGLE_API_KEY=

# OpenRouter - Access to many free models with single API
# Get key: https://openrouter.ai/keys
OPENROUTER_API_KEY=

# GitHub Models - Free GPT-4o, Llama 3.1 access
# Get token: https://github.com/settings/tokens (enable 'models' scope)
GITHUB_TOKEN=

# HuggingFace - Thousands of free models, embeddings
# Get token: https://huggingface.co/settings/tokens
HF_TOKEN=

# Together AI - $25 free credits
# Get key: https://www.together.ai/
TOGETHER_API_KEY=

# Mistral AI - Free experiment plan
# Get key: https://console.mistral.ai/
MISTRAL_API_KEY=

# ============================================================================
# Premium/Paid Providers (Optional)
# ============================================================================

# OpenAI - For GPT-4, embeddings (paid)
# Get key: https://platform.openai.com/api-keys
OPENAI_API_KEY=

# Anthropic Claude - For Claude models (paid)
# Get key: https://console.anthropic.com/
ANTHROPIC_API_KEY=

# ============================================================================
# Local Inference (Ollama)
# ============================================================================

# Ollama server configuration (default: http://localhost:11434)
OLLAMA_HOST=http://localhost:11434
OLLAMA_DEFAULT_MODEL=llama3.2:latest

# ============================================================================
# Vector Store / Database Configuration
# ============================================================================

# ChromaDB settings (local by default)
CHROMA_PERSIST_DIR=./data/chroma

# PostgreSQL (for production deployments)
# DATABASE_URL=postgresql://user:password@localhost:5432/sparknet

# ============================================================================
# Security & Authentication
# ============================================================================

# Application secret key (generate with: python -c "import secrets; print(secrets.token_hex(32))")
SECRET_KEY=

# Demo authentication password (for Streamlit demo)
# For production, use proper authentication system
DEMO_PASSWORD=

# ============================================================================
# GDPR & Data Privacy Configuration
# ============================================================================
#
# IMPORTANT: For EU deployments, ensure compliance with:
# - GDPR (General Data Protection Regulation)
# - Law 25 (Quebec privacy law) if applicable
# - Local data residency requirements
#
# Options for private/on-premise deployment:
# 1. Use Ollama for 100% local inference (no data leaves your network)
# 2. Configure data retention policies in your database
# 3. Enable audit logging for data access tracking
# 4. Implement data anonymization for sensitive documents
#
# See SECURITY.md for detailed deployment guidelines
# ============================================================================

# Enable audit logging
AUDIT_LOG_ENABLED=false
AUDIT_LOG_PATH=./logs/audit.log

# Data retention (days, 0 = indefinite)
DATA_RETENTION_DAYS=0

# Enable PII detection and masking
PII_DETECTION_ENABLED=false

# ============================================================================
# Feature Flags
# ============================================================================

# Enable experimental features
ENABLE_EXPERIMENTAL=false

# Enable GPU acceleration
ENABLE_GPU=true

# Enable caching
ENABLE_CACHE=true
CACHE_TTL_SECONDS=3600

# ============================================================================
# Logging & Monitoring
# ============================================================================

# Log level: DEBUG, INFO, WARNING, ERROR
LOG_LEVEL=INFO

# Sentry DSN for error tracking (optional)
# SENTRY_DSN=
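The template above only documents the variables; here is a minimal sketch of how application code might consume it (illustrative, not part of this commit — it assumes `python-dotenv` is installed and reuses the variable names from the template):

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

# Gather whichever free-tier provider keys are actually set
provider_keys = ["GROQ_API_KEY", "GOOGLE_API_KEY", "OPENROUTER_API_KEY",
                 "GITHUB_TOKEN", "HF_TOKEN", "TOGETHER_API_KEY", "MISTRAL_API_KEY"]
configured = [k for k in provider_keys if os.environ.get(k)]

# Flags arrive as strings ("true"/"false"); normalize to typed values
audit_enabled = os.environ.get("AUDIT_LOG_ENABLED", "false").lower() == "true"
retention_days = int(os.environ.get("DATA_RETENTION_DAYS", "0"))

print(f"Providers configured: {configured or 'none (local Ollama only)'}")
```

Leaving every key empty, as the GDPR block suggests, naturally yields an empty `configured` list, which is the signal to fall back to local Ollama inference.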
.streamlit/secrets.toml.example
CHANGED
@@ -1,14 +1,139 @@
-#
-#
+# ============================================================================
+# SPARKNET - Streamlit Secrets Configuration
+# ============================================================================
+# Copy this file to secrets.toml (DO NOT commit secrets.toml!)
+# For Streamlit Cloud: Add these via the Streamlit Cloud dashboard
+#
+# VISTA/Horizon EU Project - Technology Transfer Office Automation
+# ============================================================================
 
-#
+# ============================================================================
+# Authentication (Required)
+# ============================================================================
 [auth]
-
+# Single user mode
+password = "your-secure-password"
 
-#
-# [auth]
-#
+# Multi-user mode (uncomment to use):
+# [auth.users]
+# admin = "admin-password-here"
+# viewer = "viewer-password-here"
+# analyst = "analyst-password-here"
 
-#
-
-
+# ============================================================================
+# LLM Provider API Keys
+# ============================================================================
+# Add only the providers you want to use - system auto-selects best available
+# Priority: Groq > Gemini > OpenRouter > GitHub > Together > Mistral > HuggingFace > Offline
+
+# Groq - Fastest inference, 14,400 requests/day free
+# Get key: https://console.groq.com/keys
+GROQ_API_KEY = ""
+
+# Google Gemini/AI Studio - 15 requests/min free
+# Get key: https://aistudio.google.com/apikey
+GOOGLE_API_KEY = ""
+
+# OpenRouter - Access to many free models with single API
+# Get key: https://openrouter.ai/keys
+OPENROUTER_API_KEY = ""
+
+# GitHub Models - Free GPT-4o, Llama 3.1 access
+# Get token: https://github.com/settings/tokens (enable 'models' scope)
+GITHUB_TOKEN = ""
+
+# HuggingFace - Thousands of free models, embeddings
+# Get token: https://huggingface.co/settings/tokens
+HF_TOKEN = ""
+
+# Together AI - $25 free credits
+# Get key: https://www.together.ai/
+TOGETHER_API_KEY = ""
+
+# Mistral AI - Free experiment plan
+# Get key: https://console.mistral.ai/
+MISTRAL_API_KEY = ""
+
+# ============================================================================
+# Premium/Paid Providers (Optional)
+# ============================================================================
+
+# OpenAI - GPT-4, embeddings (paid)
+# Get key: https://platform.openai.com/api-keys
+OPENAI_API_KEY = ""
+
+# Anthropic Claude (paid)
+# Get key: https://console.anthropic.com/
+ANTHROPIC_API_KEY = ""
+
+# ============================================================================
+# Database Configuration (Optional - for production)
+# ============================================================================
+[database]
+# PostgreSQL connection (uncomment for production)
+# url = "postgresql://user:password@host:5432/sparknet"
+
+# ChromaDB persistence directory
+chroma_persist_dir = "./data/chroma"
+
+# ============================================================================
+# Security Configuration
+# ============================================================================
+[security]
+# Secret key for session management (generate with: python -c "import secrets; print(secrets.token_hex(32))")
+secret_key = ""
+
+# Enable audit logging
+audit_logging = false
+
+# ============================================================================
+# GDPR & Data Privacy
+# ============================================================================
+# IMPORTANT: For EU/VISTA deployments, configure these settings
+[privacy]
+# Data retention in days (0 = indefinite)
+data_retention_days = 0
+
+# Enable PII detection and masking
+pii_detection = false
+
+# Enable data anonymization for exports
+anonymize_exports = false
+
+# ============================================================================
+# Feature Flags
+# ============================================================================
+[features]
+# Enable experimental scenarios
+experimental_scenarios = false
+
+# Enable GPU acceleration (requires CUDA)
+gpu_enabled = true
+
+# Enable response caching
+caching_enabled = true
+cache_ttl_seconds = 3600
+
+# ============================================================================
+# Private Deployment Notes
+# ============================================================================
+# For enterprise/private deployments:
+#
+# 1. LOCAL INFERENCE (Maximum Privacy):
+#    - Use Ollama for 100% on-premise inference
+#    - No data leaves your network
+#    - Set OLLAMA_HOST = "http://localhost:11434"
+#
+# 2. HYBRID DEPLOYMENT:
+#    - Use local Ollama for sensitive documents
+#    - Use cloud LLMs for non-sensitive queries
+#    - Configure document classification rules
+#
+# 3. CLOUD DEPLOYMENT (Streamlit Cloud):
+#    - Use secrets management (this file)
+#    - Enable audit logging
+#    - Configure data retention policies
+#    - Review GDPR compliance checklist
+#
+# See DEPLOYMENT.md for detailed instructions
+# ============================================================================
SECURITY.md
ADDED
@@ -0,0 +1,342 @@
# SPARKNET Security Documentation

This document outlines security considerations, deployment options, and compliance
guidelines for the SPARKNET AI-Powered Technology Transfer Office Automation Platform.

## Overview

SPARKNET handles sensitive data including:
- Patent documents and IP information
- License agreements and financial terms
- Partner/stakeholder contact information
- Research data and findings

Proper security measures are essential for production deployments.

---

## Deployment Options

### 1. Fully Local Deployment (Maximum Privacy)

**Recommended for:** Organizations with strict data sovereignty requirements, classified research, or GDPR Article 17 obligations.

```
┌─────────────────────────────────────────────────────────────┐
│                     Your Private Network                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  SPARKNET   │──│   Ollama    │──│  Local Vector Store │  │
│  │ (Streamlit) │  │   (LLM)     │  │     (ChromaDB)      │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│        │                                                     │
│  ┌─────────────┐  ┌─────────────────────────────────────┐   │
│  │ PostgreSQL  │  │  Document Storage (NFS/S3-compat)   │   │
│  │ (metadata)  │  │                                     │   │
│  └─────────────┘  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```

**Configuration:**
- Set no cloud API keys in `.env`
- System automatically uses Ollama for all inference
- All data remains within your network
- No external API calls for LLM inference

**Setup:**
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull required models
ollama pull llama3.2:latest
ollama pull nomic-embed-text

# Configure SPARKNET
cp .env.example .env
# Leave cloud API keys empty

# Run
streamlit run demo/app.py
```

### 2. Hybrid Deployment (Balanced)

**Recommended for:** Organizations that want cloud LLM capabilities for non-sensitive operations while keeping sensitive data local.

```
┌─────────────────────────────────────────────────────────────┐
│                     Your Private Network                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  SPARKNET   │──│   Ollama    │──│  Document Storage   │  │
│  │ (Streamlit) │  │ (Sensitive) │  │    (Encrypted)      │  │
│  └──────┬──────┘  └─────────────┘  └─────────────────────┘  │
└─────────│───────────────────────────────────────────────────┘
          │
          │  (Non-sensitive queries only)
          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Cloud LLM Providers                       │
│  ┌─────────┐  ┌─────────┐  ┌─────────────┐  ┌───────────┐   │
│  │  Groq   │  │ Gemini  │  │ OpenRouter  │  │  GitHub   │   │
│  └─────────┘  └─────────┘  └─────────────┘  └───────────┘   │
└─────────────────────────────────────────────────────────────┘
```

**Configuration:**
- Configure cloud API keys for general queries
- Use document sensitivity classification
- Route sensitive documents to local Ollama
- Implement data anonymization for cloud queries
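A minimal, self-contained sketch of the routing rule just described (the keyword classifier is a stand-in for real document labels, and the returned backend names are illustrative):

```python
SENSITIVE_MARKERS = ("confidential", "restricted", "trade secret")

def classify_sensitivity(text: str) -> str:
    """Toy classifier: real deployments would use proper document labels."""
    lowered = text.lower()
    return "confidential" if any(m in lowered for m in SENSITIVE_MARKERS) else "public"

def pick_backend(text: str, cloud_key_configured: bool) -> str:
    """Route sensitive documents to local Ollama; others may use a cloud LLM."""
    if classify_sensitivity(text) != "public" or not cloud_key_configured:
        return "ollama"   # stays on-premise
    return "cloud"        # anonymize first before sending

assert pick_backend("This agreement is CONFIDENTIAL.", True) == "ollama"
```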
### 3. Cloud Deployment (Streamlit Cloud)

**Recommended for:** Public demos, non-sensitive research, or when local infrastructure is not available.

**Configuration:**
```toml
# .streamlit/secrets.toml
[auth]
password = "your-secure-password"

GROQ_API_KEY = "your-key"
GOOGLE_API_KEY = "your-key"
```

**Security Checklist:**
- [ ] Use secrets management (never commit API keys)
- [ ] Enable authentication
- [ ] Review provider data processing policies
- [ ] Consider data anonymization
- [ ] Implement session timeouts

---

## GDPR Compliance

### Data Processing Principles

SPARKNET is designed to support GDPR compliance:

1. **Lawfulness, Fairness, Transparency**
   - Document all data processing activities
   - Obtain appropriate consent for personal data
   - Provide clear privacy notices

2. **Purpose Limitation**
   - Use data only for stated TTO purposes
   - Do not repurpose data without consent

3. **Data Minimization**
   - Only process necessary data
   - Anonymize data when possible
   - Implement data retention policies

4. **Accuracy**
   - CriticAgent validation helps ensure accuracy
   - Human-in-the-loop for critical decisions
   - Source verification for claims

5. **Storage Limitation**
   - Configure `DATA_RETENTION_DAYS` in `.env`
   - Implement automatic data purging
   - Support data deletion requests

6. **Integrity and Confidentiality**
   - Encrypt data at rest
   - Use TLS for data in transit
   - Implement access controls

### Data Subject Rights

Support for GDPR data subject rights:

| Right | Implementation |
|-------|----------------|
| Access | Export function for user data |
| Rectification | Edit capabilities in UI |
| Erasure | Delete user data on request |
| Portability | JSON/CSV export options |
| Objection | Opt-out from AI processing |
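A small sketch of how the storage-limitation and erasure points above could be enforced (it assumes records carry a `stored_at` timestamp; this is not SPARKNET's actual implementation):

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # mirrors DATA_RETENTION_DAYS; 0 would mean "keep forever"

def is_expired(stored_at: datetime, retention_days: int = RETENTION_DAYS) -> bool:
    """True if a record has outlived the configured retention window."""
    if retention_days == 0:
        return False
    cutoff = datetime.now(timezone.utc) - timedelta(days=retention_days)
    return stored_at < cutoff

def purge(records: list, retention_days: int = RETENTION_DAYS) -> list:
    """Drop expired records - one way to implement automatic data purging."""
    return [r for r in records if not is_expired(r["stored_at"], retention_days)]
```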
### Cross-Border Data Transfers

When using cloud LLM providers:

1. **EU-US Data Transfers:**
   - Review provider's Data Processing Agreement
   - Ensure Standard Contractual Clauses are in place
   - Consider EU-hosted alternatives

2. **Recommended Approach:**
   - Use Ollama for EU data residency
   - Anonymize data before cloud API calls
   - Implement geographic routing

---

## Security Best Practices

### API Key Management

```python
# GOOD: Load from environment/secrets
api_key = os.environ.get("GROQ_API_KEY")
# or
api_key = st.secrets.get("GROQ_API_KEY")

# BAD: Hardcoded keys
api_key = "gsk_abc123..."  # NEVER DO THIS
```

### Authentication

Configure authentication in `.streamlit/secrets.toml`:

```toml
[auth]
# Single user
password = "strong-password-here"

# Multi-user
[auth.users]
admin = "admin-password"
analyst = "analyst-password"
viewer = "viewer-password"
```

### Audit Logging

Enable audit logging for compliance:

```env
AUDIT_LOG_ENABLED=true
AUDIT_LOG_PATH=./logs/audit.log
```

The audit log includes:
- User authentication events
- Document access
- AI query/response pairs
- Decision point approvals
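One way such audit records might be written — a hedged sketch using only the standard library (the JSON-lines layout and field names are assumptions, not SPARKNET's actual format):

```python
import json
import logging
import os
from datetime import datetime, timezone

os.makedirs("./logs", exist_ok=True)
audit = logging.getLogger("sparknet.audit")
audit.addHandler(logging.FileHandler("./logs/audit.log"))
audit.setLevel(logging.INFO)

def log_event(user: str, action: str, resource: str) -> None:
    """Append one structured audit record (JSON lines are easy to review)."""
    audit.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,       # e.g. "login", "document_access", "hitl_approval"
        "resource": resource,
    }))

log_event("analyst", "document_access", "patents/example.pdf")
```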
### Network Security

For production deployments:

1. **Firewall Rules:**
   - Restrict Ollama to internal network
   - Limit database access to app servers
   - Use VPN for remote access

2. **TLS/SSL:**
   - Enable HTTPS for Streamlit
   - Use encrypted database connections
   - Secure WebSocket connections

3. **Access Control:**
   - Implement role-based access
   - Use IP allowlisting where possible
   - Enable MFA for admin access

---

## Sensitive Data Handling

### Document Classification

SPARKNET can classify documents by sensitivity:

| Level | Description | Handling |
|-------|-------------|----------|
| Public | Non-confidential | Cloud LLM allowed |
| Internal | Business confidential | Prefer local |
| Confidential | Sensitive business | Local only |
| Restricted | Highly sensitive | Local + encryption |

### PII Detection

Enable PII detection:

```env
PII_DETECTION_ENABLED=true
```

Detected PII types:
- Names (persons)
- Email addresses
- Phone numbers
- Addresses
- ID numbers

### Data Anonymization

For cloud API calls, implement anonymization:

```python
# Pseudonymization example
text = text.replace(real_name, "[PERSON_1]")
text = text.replace(company_name, "[COMPANY_1]")
```
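A slightly fuller sketch of the same idea, with regex-based masking and a reversible placeholder mapping (illustrative only; production systems should prefer a dedicated PII library):

```python
import re

def mask_pii(text: str):
    """Mask emails and phone-like numbers, keeping a mapping so responses
    can be de-pseudonymized after the cloud API call."""
    mapping = {}

    def repl(kind):
        def _sub(match):
            token = f"[{kind}_{len(mapping) + 1}]"
            mapping[token] = match.group(0)
            return token
        return _sub

    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl("EMAIL"), text)
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", repl("PHONE"), text)
    return text, mapping

masked, mapping = mask_pii("Contact Jane at jane@uni.eu or +33 1 23 45 67 89.")
```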
---

## Incident Response

### Security Incident Procedure

1. **Detection:** Monitor audit logs and alerts
2. **Containment:** Isolate affected systems
3. **Investigation:** Determine scope and impact
4. **Notification:** Inform stakeholders (72h for GDPR)
5. **Recovery:** Restore from clean backups
6. **Lessons Learned:** Update security measures

### Contact

For security issues:
- Review issue privately before public disclosure
- Report to project maintainers
- Follow responsible disclosure practices

---

## Compliance Checklist

### Pre-Deployment

- [ ] API keys stored in secrets management
- [ ] Authentication configured
- [ ] Audit logging enabled
- [ ] Data retention policy defined
- [ ] Backup strategy implemented
- [ ] Network security reviewed

### GDPR Compliance

- [ ] Data processing register updated
- [ ] Privacy notice published
- [ ] Data subject rights procedures in place
- [ ] Cross-border transfer safeguards
- [ ] Data Protection Impact Assessment (if required)

### Ongoing

- [ ] Regular security audits
- [ ] Log review and monitoring
- [ ] Access control review
- [ ] Incident response testing
- [ ] Staff security training

---

## Additional Resources

- [GDPR Official Text](https://gdpr.eu/)
- [Ollama Documentation](https://ollama.com/)
- [Streamlit Security](https://docs.streamlit.io/deploy/streamlit-community-cloud/security)
- [OWASP Top 10](https://owasp.org/Top10/)

---

*SPARKNET - VISTA/Horizon EU Project*
*Last Updated: 2025*
demo/app.py
CHANGED
@@ -1,12 +1,24 @@
 """
-SPARKNET
-
-A Streamlit-based
-
-
--
--
-
+SPARKNET - AI-Powered Technology Transfer Office (TTO) Automation Platform
+
+A comprehensive Streamlit-based platform for research valorization and IP management:
+
+CORE TTO SCENARIOS:
+1. Patent Wake-Up: Transform dormant patents into commercialization opportunities
+2. Agreement Safety: AI-assisted legal document review with risk detection
+3. Partner Matching: Intelligent stakeholder matching for technology transfer
+4. License Compliance Monitoring: Payment tracking, milestone verification, revenue alerts
+5. Award Identification: Funding opportunity scanning and nomination assistance
+
+FEATURES:
+- Multi-agent AI orchestration with CriticAgent validation
+- Document Intelligence with evidence grounding
+- RAG-powered search and Q&A with source verification
+- Confidence scoring and hallucination mitigation
+- Human-in-the-loop decision points
+- GDPR-compliant data handling options
+
+VISTA/Horizon EU Project - Supporting European research valorization
 """
 
 import streamlit as st

@@ -23,10 +35,13 @@ sys.path.insert(0, str(PROJECT_ROOT))
 
 # Page configuration - MUST be first Streamlit command
 st.set_page_config(
-    page_title="SPARKNET
+    page_title="SPARKNET - TTO Automation Platform",
     page_icon="🔥",
     layout="wide",
     initial_sidebar_state="expanded",
+    menu_items={
+        'About': "SPARKNET: AI-Powered Technology Transfer Office Automation\n\nVISTA/Horizon EU Project"
+    }
 )
 
 # Authentication - require login before showing app

@@ -85,6 +100,95 @@ st.markdown("""
         background-color: #f0f2f6;
         border-radius: 8px;
     }
+    /* Coverage badges */
+    .coverage-full {
+        background: linear-gradient(135deg, #22c55e 0%, #16a34a 100%);
+        color: white;
+        padding: 0.3rem 0.8rem;
+        border-radius: 20px;
+        font-size: 0.75rem;
+        font-weight: bold;
+    }
+    .coverage-partial {
+        background: linear-gradient(135deg, #eab308 0%, #ca8a04 100%);
+        color: white;
+        padding: 0.3rem 0.8rem;
+        border-radius: 20px;
+        font-size: 0.75rem;
+        font-weight: bold;
+    }
+    .coverage-none {
+        background: linear-gradient(135deg, #94a3b8 0%, #64748b 100%);
+        color: white;
+        padding: 0.3rem 0.8rem;
+        border-radius: 20px;
+        font-size: 0.75rem;
+        font-weight: bold;
+    }
+    /* EU/VISTA badges */
+    .eu-badge {
+        background: linear-gradient(135deg, #003399 0%, #0052cc 100%);
+        color: #ffcc00;
+        padding: 0.4rem 1rem;
+        border-radius: 8px;
+        font-size: 0.8rem;
+        font-weight: bold;
+        display: inline-block;
+        margin: 0.2rem;
+    }
+    .vista-badge {
+        background: linear-gradient(135deg, #7c3aed 0%, #a855f7 100%);
+        color: white;
+        padding: 0.4rem 1rem;
+        border-radius: 8px;
+        font-size: 0.8rem;
+        font-weight: bold;
+        display: inline-block;
+        margin: 0.2rem;
+    }
+    /* Scenario cards */
+    .scenario-card {
+        background: white;
+        border: 1px solid #e5e7eb;
+        border-radius: 12px;
+        padding: 1.2rem;
+        margin: 0.5rem 0;
+        box-shadow: 0 2px 4px rgba(0,0,0,0.05);
+        transition: transform 0.2s, box-shadow 0.2s;
+    }
+    .scenario-card:hover {
+        transform: translateY(-2px);
+        box-shadow: 0 4px 12px rgba(0,0,0,0.1);
+    }
+    /* Validation indicator */
+    .validation-indicator {
+        display: inline-flex;
+        align-items: center;
+        gap: 0.5rem;
+        padding: 0.3rem 0.8rem;
+        border-radius: 6px;
+        font-size: 0.85rem;
+    }
+    .validation-pass {
+        background: #dcfce7;
+        color: #166534;
+    }
+    .validation-warn {
+        background: #fef3c7;
+        color: #92400e;
+    }
+    .validation-fail {
+        background: #fecaca;
+        color: #991b1b;
+    }
+    /* Human-in-the-loop button */
+    .hitl-prompt {
+        background: linear-gradient(135deg, #fef3c7 0%, #fde68a 100%);
+        border: 2px solid #f59e0b;
+        border-radius: 8px;
+        padding: 1rem;
+        margin: 1rem 0;
+    }
     </style>
 """, unsafe_allow_html=True)

@@ -107,14 +211,217 @@ def format_confidence(confidence: float) -> str:
         return f'<span class="confidence-low">{confidence:.1%}</span>'
 
 
+def render_critic_validation(validation_result: dict) -> None:
+    """
+    Render CriticAgent validation results in the UI.
+
+    Displays validation scores, issues, and suggestions
+    with clear visual indicators.
+    """
+    overall_score = validation_result.get("overall_score", 0.0)
+    is_valid = validation_result.get("valid", False)
+    dimension_scores = validation_result.get("dimension_scores", {})
+    issues = validation_result.get("issues", [])
+    suggestions = validation_result.get("suggestions", [])
+
+    # Overall validation status
+    if is_valid and overall_score >= 0.85:
+        status_class = "validation-pass"
+        status_icon = "✓"
+        status_text = "Validated"
+    elif overall_score >= 0.6:
+        status_class = "validation-warn"
+        status_icon = "⚠"
+        status_text = "Review Recommended"
+    else:
+        status_class = "validation-fail"
+        status_icon = "✗"
+        status_text = "Validation Failed"
+
+    st.markdown(f"""
+    <div style="background: #f8fafc; border-radius: 12px; padding: 1rem; margin: 1rem 0; border: 1px solid #e2e8f0;">
+        <div style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 1rem;">
+            <h4 style="margin: 0;">🛡️ CriticAgent Validation</h4>
+            <span class="validation-indicator {status_class}">{status_icon} {status_text}</span>
+        </div>
+        <div style="display: flex; gap: 1rem; flex-wrap: wrap;">
+            <div style="flex: 1; min-width: 200px;">
+                <strong>Overall Score</strong>
+                <div style="font-size: 2rem; font-weight: bold; color: {'#22c55e' if overall_score >= 0.8 else '#eab308' if overall_score >= 0.6 else '#ef4444'};">
+                    {overall_score:.0%}
+                </div>
+            </div>
+    """, unsafe_allow_html=True)
+
+    # Dimension scores
+    if dimension_scores:
+        st.markdown("**Quality Dimensions:**")
+        cols = st.columns(len(dimension_scores))
+        for i, (dim, score) in enumerate(dimension_scores.items()):
+            with cols[i]:
+                st.metric(
+                    dim.replace("_", " ").title(),
+                    f"{score:.0%}",
+                    delta=None,
+                )
+
+    # Issues
+    if issues:
+        with st.expander("⚠️ Issues Found", expanded=len(issues) <= 3):
+            for issue in issues:
+                st.markdown(f"- {issue}")
+
+    # Suggestions
+    if suggestions:
+        with st.expander("💡 Improvement Suggestions"):
+            for suggestion in suggestions:
+                st.markdown(f"- {suggestion}")
+
+    st.markdown("</div></div>", unsafe_allow_html=True)
+
+
+def render_source_verification(sources: list, claim: str = "") -> None:
+    """
+    Render source verification for hallucination mitigation.
+
+    Shows the sources used to generate AI responses with
+    verification status.
+    """
+    st.markdown("""
+    <div style="background: #f0fdf4; border-radius: 8px; padding: 1rem; border: 1px solid #bbf7d0;">
+        <h5 style="margin: 0 0 0.5rem 0;">📎 Source Verification</h5>
+    """, unsafe_allow_html=True)
+
+    if sources:
+        verified_count = sum(1 for s in sources if s.get("verified", False))
+        total_count = len(sources)
+
+        st.markdown(f"""
+        <div style="margin-bottom: 0.5rem;">
+            <span style="color: #166534;">✓ {verified_count}/{total_count} sources verified</span>
+        </div>
+        """, unsafe_allow_html=True)
+
+        for i, source in enumerate(sources[:5]):  # Show top 5 sources
+            verified = source.get("verified", False)
+            page = source.get("page", "N/A")
+            snippet = source.get("snippet", "")[:100]
+            confidence = source.get("confidence", 0.0)
+
+            st.markdown(f"""
+            <div style="background: white; border-radius: 4px; padding: 0.5rem; margin: 0.3rem 0; border-left: 3px solid {'#22c55e' if verified else '#eab308'};">
+                <small>
+                    <strong>[{i+1}]</strong> Page {page} | Confidence: {confidence:.0%}
+                    {' ✓' if verified else ' ⚠'}
+                    <br>
+                    <em>"{snippet}..."</em>
+                </small>
+            </div>
+            """, unsafe_allow_html=True)
+    else:
+        st.markdown("""
+        <p style="color: #666; margin: 0;">No source verification available for this response.</p>
+        """, unsafe_allow_html=True)
+
+    st.markdown("</div>", unsafe_allow_html=True)
+
+
+def render_human_decision_point(
+    question: str,
+    options: list,
+    ai_recommendation: str = None,
+    ai_confidence: float = None,
+) -> str:
+    """
+    Render a human-in-the-loop decision point.
+
+    Shows AI recommendation but requires human approval
+    for critical decisions.
+
+    Returns:
+        Selected option from human
+    """
+    st.markdown("""
+    <div class="hitl-prompt">
+        <h4 style="margin: 0 0 0.5rem 0;">👤 Human Decision Required</h4>
+    """, unsafe_allow_html=True)
+
+    st.markdown(f"**{question}**")
+
+    if ai_recommendation and ai_confidence:
+        st.markdown(f"""
+        <div style="background: white; border-radius: 4px; padding: 0.5rem; margin: 0.5rem 0;">
+            <small>
+                <strong>AI Recommendation:</strong> {ai_recommendation}
+                (Confidence: {ai_confidence:.0%})
+            </small>
+        </div>
+        """, unsafe_allow_html=True)
+
+    selected = st.radio(
+        "Your decision:",
+        options,
+        label_visibility="collapsed",
+        key=f"hitl_{hash(question)}",
+    )
+
+    st.markdown("</div>", unsafe_allow_html=True)
+
+    return selected
+
+
+def render_confidence_indicator(confidence: float, label: str = "Confidence") -> None:
+    """
+    Render a visual confidence indicator.
+
+    Shows confidence as a progress bar with color coding.
+    """
+    if confidence >= 0.8:
+        color = "#22c55e"
+        status = "High"
+    elif confidence >= 0.6:
+        color = "#eab308"
+        status = "Medium"
+    else:
+        color = "#ef4444"
+        status = "Low"
+
+    st.markdown(f"""
+    <div style="margin: 0.5rem 0;">
+        <div style="display: flex; justify-content: space-between; margin-bottom: 0.3rem;">
+            <small><strong>{label}</strong></small>
+            <small style="color: {color};">{status} ({confidence:.0%})</small>
+        </div>
+        <div style="background: #e5e7eb; border-radius: 4px; height: 8px; overflow: hidden;">
+            <div style="background: {color}; width: {confidence*100}%; height: 100%;"></div>
+        </div>
+    </div>
+    """, unsafe_allow_html=True)
+
+
 def render_header():
-    """Render the main header."""
+    """Render the main header with TTO branding and EU badges."""
     col1, col2 = st.columns([3, 1])
     with col1:
         st.markdown('<div class="main-header">🔥 SPARKNET</div>', unsafe_allow_html=True)
-        st.markdown('<div class="sub-header">
+        st.markdown('<div class="sub-header">AI-Powered Technology Transfer Office Automation Platform</div>', unsafe_allow_html=True)
+        # EU/VISTA alignment badges
+        st.markdown('''
+        <div style="margin-top: 0.5rem;">
+            <span class="vista-badge">VISTA Project</span>
+            <span class="eu-badge">Horizon EU</span>
+        </div>
+        ''', unsafe_allow_html=True)
     with col2:
-        st.
+        st.markdown('''
+        <div style="text-align: right;">
+            <img src="https://img.shields.io/badge/version-1.0.0-blue" style="margin: 2px;">
+            <br>
+            <img src="https://img.shields.io/badge/scenarios-5-green" style="margin: 2px;">
+            <br>
+            <img src="https://img.shields.io/badge/status-production-success" style="margin: 2px;">
+        </div>
+        ''', unsafe_allow_html=True)
 
 
 def render_sidebar():

@@ -174,31 +481,222 @@ def check_chromadb_status():
 
 
 def render_home_page():
-    """Render the home page."""
-    st.markdown("##
+    """Render the TTO dashboard home page with scenarios and coverage metrics."""
+    st.markdown("## Technology Transfer Office Dashboard")
 
     st.markdown("""
-    SPARKNET is
-
-
-    - **🔍 RAG Subsystem**: Vector search with ChromaDB, grounded retrieval with citations
-    - **🤖 Multi-Agent System**: ReAct-style agents with tool use and validation
-    - **🏠 Local-First**: Privacy-preserving inference via Ollama
-    - **📎 Evidence Grounding**: Every extraction includes bbox, page, chunk_id references
+    SPARKNET is a comprehensive **AI-Powered Technology Transfer Office (TTO) Automation Platform**
+    designed for research valorization and IP management. Built for the VISTA/Horizon EU project,
+    it combines multi-agent AI orchestration with document intelligence to automate key TTO workflows.
     """)
 
     st.markdown("---")
 
-    #
+    # =========================================================================
+    # COVERAGE METRICS DASHBOARD
+    # =========================================================================
+    st.markdown("### 📊 TTO Task Coverage Dashboard")
+
+    col1, col2, col3 = st.columns(3)
+
+    with col1:
+        st.markdown("""
+        <div style="background: linear-gradient(135deg, #22c55e 0%, #16a34a 100%);
+                    border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
+            <h1 style="margin: 0; font-size: 3rem;">3</h1>
+            <h4 style="margin: 0.5rem 0;">Fully Covered</h4>
+            <p style="font-size: 0.85rem; opacity: 0.9;">Production-ready scenarios</p>
+        </div>
+        """, unsafe_allow_html=True)
+
+    with col2:
+        st.markdown("""
+        <div style="background: linear-gradient(135deg, #eab308 0%, #ca8a04 100%);
+                    border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
+            <h1 style="margin: 0; font-size: 3rem;">5</h1>
+            <h4 style="margin: 0.5rem 0;">Partially Covered</h4>
+            <p style="font-size: 0.85rem; opacity: 0.9;">In development</p>
+        </div>
+        """, unsafe_allow_html=True)
+
+    with col3:
+        st.markdown("""
+        <div style="background: linear-gradient(135deg, #94a3b8 0%, #64748b 100%);
+                    border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
+            <h1 style="margin: 0; font-size: 3rem;">2</h1>
+            <h4 style="margin: 0.5rem 0;">Not Covered</h4>
+            <p style="font-size: 0.85rem; opacity: 0.9;">Planned for future</p>
+        </div>
+        """, unsafe_allow_html=True)
+
+    st.markdown("---")
+
+    # =========================================================================
+    # CORE TTO SCENARIOS
+    # =========================================================================
+    st.markdown("### 🎯 Core TTO Scenarios")
+
+    # Fully Covered Scenarios
+    st.markdown("#### Fully Implemented")
+    col1, col2, col3 = st.columns(3)
+
+    with col1:
+        st.markdown("""
+        <div class="scenario-card">
+            <div style="display: flex; justify-content: space-between; align-items: start;">
+                <h4 style="margin: 0;">💡 Patent Wake-Up</h4>
+                <span class="coverage-full">LIVE</span>
+            </div>
+            <p style="color: #666; margin: 0.5rem 0;">Transform dormant patents into commercialization opportunities</p>
+            <hr style="margin: 0.8rem 0; opacity: 0.3;">
+            <small>
+                <strong>Features:</strong><br>
+                • TRL Assessment<br>
+                • Market Analysis<br>
+                • Partner Matching<br>
+                • Valorization Brief Generation
+            </small>
+            <div style="margin-top: 0.8rem;">
+                <span class="vista-badge" style="font-size: 0.7rem; padding: 0.2rem 0.5rem;">VISTA Aligned</span>
+            </div>
+        </div>
+        """, unsafe_allow_html=True)
+
+    with col2:
+        st.markdown("""
+        <div class="scenario-card">
+            <div style="display: flex; justify-content: space-between; align-items: start;">
+                <h4 style="margin: 0;">⚖️ Agreement Safety</h4>
+                <span class="coverage-full">LIVE</span>
+            </div>
+            <p style="color: #666; margin: 0.5rem 0;">AI-assisted legal document review with risk detection</p>
+            <hr style="margin: 0.8rem 0; opacity: 0.3;">
+            <small>
+                <strong>Features:</strong><br>
+                • Risk Clause Detection<br>
+                • GDPR Compliance Check<br>
+                • Law 25 Alignment<br>
+                • Remediation Suggestions
+            </small>
+            <div style="margin-top: 0.8rem;">
+                <span class="eu-badge" style="font-size: 0.7rem; padding: 0.2rem 0.5rem;">GDPR Ready</span>
+            </div>
+        </div>
+        """, unsafe_allow_html=True)
+
+    with col3:
+        st.markdown("""
+        <div class="scenario-card">
+            <div style="display: flex; justify-content: space-between; align-items: start;">
+                <h4 style="margin: 0;">🤝 Partner Matching</h4>
+                <span class="coverage-full">LIVE</span>
+            </div>
+            <p style="color: #666; margin: 0.5rem 0;">Intelligent stakeholder matching for technology transfer</p>
+            <hr style="margin: 0.8rem 0; opacity: 0.3;">
+            <small>
+                <strong>Features:</strong><br>
+                • Multi-criteria Scoring<br>
+                • Geographic Matching<br>
+                • Technical Fit Analysis<br>
+                • Outreach Recommendations
+            </small>
+            <div style="margin-top: 0.8rem;">
+                <span class="vista-badge" style="font-size: 0.7rem; padding: 0.2rem 0.5rem;">VISTA Aligned</span>
+            </div>
+        </div>
+        """, unsafe_allow_html=True)
+
+    # Partially Covered Scenarios
+    st.markdown("#### In Development")
+    col1, col2 = st.columns(2)
+
+    with col1:
+        st.markdown("""
+        <div class="scenario-card" style="border-left: 4px solid #eab308;">
+            <div style="display: flex; justify-content: space-between; align-items: start;">
+                <h4 style="margin: 0;">📋 License Compliance Monitoring</h4>
+                <span class="coverage-partial">DEV</span>
+            </div>
+            <p style="color: #666; margin: 0.5rem 0;">Track license agreements and ensure compliance</p>
+            <hr style="margin: 0.8rem 0; opacity: 0.3;">
+            <small>
+                <strong>Planned Features:</strong><br>
+                • Payment Tracking & Alerts<br>
+                • Milestone Verification<br>
+                • Revenue Monitoring<br>
+                • Compliance Reporting
+            </small>
+        </div>
+        """, unsafe_allow_html=True)
+
+    with col2:
+        st.markdown("""
+        <div class="scenario-card" style="border-left: 4px solid #eab308;">
+            <div style="display: flex; justify-content: space-between; align-items: start;">
+                <h4 style="margin: 0;">🏆 Award Identification</h4>
+                <span class="coverage-partial">DEV</span>
+            </div>
+            <p style="color: #666; margin: 0.5rem 0;">Discover funding opportunities and awards</p>
+            <hr style="margin: 0.8rem 0; opacity: 0.3;">
+            <small>
+                <strong>Planned Features:</strong><br>
+                • Opportunity Scanning<br>
+                • Nomination Assistance<br>
+                • Deadline Tracking<br>
+                • Application Support
+            </small>
+        </div>
+        """, unsafe_allow_html=True)
+
+    st.markdown("---")
+
+    # =========================================================================
+    # AI QUALITY ASSURANCE
+    # =========================================================================
+    st.markdown("### 🛡️ AI Quality Assurance")
+
+    col1, col2, col3 = st.columns(3)
+
+    with col1:
+        st.markdown("""
+        <div style="background: #f0f9ff; border-radius: 8px; padding: 1rem; border: 1px solid #bae6fd;">
+            <h4 style="margin: 0 0 0.5rem 0;">🔍 CriticAgent Validation</h4>
+            <p style="font-size: 0.9rem; margin: 0;">Every AI output is validated against VISTA quality standards with dimension-based scoring.</p>
+        </div>
+        """, unsafe_allow_html=True)
+
+    with col2:
+        st.markdown("""
+        <div style="background: #f0fdf4; border-radius: 8px; padding: 1rem; border: 1px solid #bbf7d0;">
+            <h4 style="margin: 0 0 0.5rem 0;">📊 Confidence Scoring</h4>
+            <p style="font-size: 0.9rem; margin: 0;">All extractions include confidence scores with automatic abstention for low-confidence results.</p>

     col1, col2, col3, col4 = st.columns(4)
 
     with col1:
         st.markdown("""
         <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
                     border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
-            <h3>📄</h3>
-            <h4>Document
-            <p style="font-size: 0.
         </div>
         """, unsafe_allow_html=True)
 

@@ -206,9 +704,9 @@ def render_home_page():
     st.markdown("""
     <div style="background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%);
                 border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
-        <h3>🔍</h3>
-        <h4>
-        <p style="font-size: 0.
     </div>
     """, unsafe_allow_html=True)
 

@@ -216,9 +714,9 @@ def render_home_page():
     st.markdown("""
     <div style="background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%);
                 border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
-        <h3>💬</h3>
-        <h4>RAG Q&A</h4>
-        <p style="font-size: 0.
     </div>
     """, unsafe_allow_html=True)
 

@@ -226,29 +724,34 @@ def render_home_page():
     st.markdown("""
     <div style="background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%);
                 border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
-        <h3
-        <h4>
-        <p style="font-size: 0.
     </div>
     """, unsafe_allow_html=True)
 
     st.markdown("---")
 
     # Quick start
-    st.markdown("### Quick Start")
 
-    with st.expander("
     st.markdown("""
-
-
-
-
 
-    **
     """)
 
     # Sample documents preview
-    st.markdown("###
     docs = get_sample_documents()
 
     if docs:

@@ -624,20 +1127,42 @@ def extract_fields_demo(doc_name, fields, validate, include_evidence):
 
     st.markdown("")
 
-    # Validation results
     if validate:
         st.markdown("---")
-        st.markdown("### Validation Results")
 
-
-
-
-
-
-
-
 
-
 
 
 def render_rag_page():

@@ -942,9 +1467,12 @@ def main():
     # Footer
     st.markdown("---")
     st.markdown(
-        "<div style='text-align: center; color: #666;'>
-
-
         unsafe_allow_html=True,
     )
 
<p style="font-size: 0.9rem; margin: 0;">All extractions include confidence scores with automatic abstention for low-confidence results.</p>
|
| 673 |
+
</div>
|
| 674 |
+
""", unsafe_allow_html=True)
|
| 675 |
+
|
| 676 |
+
with col3:
|
| 677 |
+
st.markdown("""
|
| 678 |
+
<div style="background: #fefce8; border-radius: 8px; padding: 1rem; border: 1px solid #fef08a;">
|
| 679 |
+
<h4 style="margin: 0 0 0.5rem 0;">👤 Human-in-the-Loop</h4>
|
| 680 |
+
<p style="font-size: 0.9rem; margin: 0;">Critical decisions require human approval with clear decision points throughout workflows.</p>
|
| 681 |
+
</div>
|
| 682 |
+
""", unsafe_allow_html=True)
|
| 683 |
+
|
| 684 |
+
st.markdown("---")
|
| 685 |
+
|
| 686 |
+
# =========================================================================
|
| 687 |
+
# PLATFORM CAPABILITIES
|
| 688 |
+
# =========================================================================
|
| 689 |
+
st.markdown("### 🚀 Platform Capabilities")
|
| 690 |
+
|
| 691 |
col1, col2, col3, col4 = st.columns(4)
|
| 692 |
|
| 693 |
with col1:
|
| 694 |
st.markdown("""
|
| 695 |
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
| 696 |
border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
|
| 697 |
+
<h3 style="margin: 0;">📄</h3>
|
| 698 |
+
<h4 style="margin: 0.5rem 0;">Document Intelligence</h4>
|
| 699 |
+
<p style="font-size: 0.85rem; opacity: 0.9;">OCR, Layout, Chunking</p>
|
| 700 |
</div>
|
| 701 |
""", unsafe_allow_html=True)
|
| 702 |
|
|
|
|
| 704 |
st.markdown("""
|
| 705 |
<div style="background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%);
|
| 706 |
border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
|
| 707 |
+
<h3 style="margin: 0;">🔍</h3>
|
| 708 |
+
<h4 style="margin: 0.5rem 0;">Evidence Grounding</h4>
|
| 709 |
+
<p style="font-size: 0.85rem; opacity: 0.9;">Source Verification</p>
|
| 710 |
</div>
|
| 711 |
""", unsafe_allow_html=True)
|
| 712 |
|
|
|
|
| 714 |
st.markdown("""
|
| 715 |
<div style="background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%);
|
| 716 |
border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
|
| 717 |
+
<h3 style="margin: 0;">💬</h3>
|
| 718 |
+
<h4 style="margin: 0.5rem 0;">RAG Q&A</h4>
|
| 719 |
+
<p style="font-size: 0.85rem; opacity: 0.9;">Grounded Citations</p>
|
| 720 |
</div>
|
| 721 |
""", unsafe_allow_html=True)
|
| 722 |
|
|
|
|
| 724 |
st.markdown("""
|
| 725 |
<div style="background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%);
|
| 726 |
border-radius: 12px; padding: 1.5rem; color: white; text-align: center;">
|
| 727 |
+
<h3 style="margin: 0;">🤖</h3>
|
| 728 |
+
<h4 style="margin: 0.5rem 0;">Multi-Agent AI</h4>
|
| 729 |
+
<p style="font-size: 0.85rem; opacity: 0.9;">Orchestrated Workflows</p>
|
| 730 |
</div>
|
| 731 |
""", unsafe_allow_html=True)
|
| 732 |
|
| 733 |
st.markdown("---")
|
| 734 |
|
| 735 |
# Quick start
|
| 736 |
+
st.markdown("### 📚 Quick Start Guide")
|
| 737 |
|
| 738 |
+
with st.expander("Getting Started with SPARKNET", expanded=True):
|
| 739 |
st.markdown("""
|
| 740 |
+
**For TTO Staff:**
|
| 741 |
+
1. **Patent Wake-Up**: Upload a dormant patent to generate a valorization roadmap
|
| 742 |
+
2. **Agreement Safety**: Upload contracts/agreements for AI-assisted risk review
|
| 743 |
+
3. **Partner Matching**: Find suitable industry partners for your technologies
|
| 744 |
|
| 745 |
+
**For Researchers:**
|
| 746 |
+
1. **Document Processing**: Process research documents with OCR and extraction
|
| 747 |
+
2. **RAG Q&A**: Ask questions about indexed documents
|
| 748 |
+
3. **Evidence Viewer**: Verify AI responses with source grounding
|
| 749 |
+
|
| 750 |
+
**Sample Documents**: The demo includes patent documents from major tech companies for testing.
|
| 751 |
""")
|
| 752 |
|
| 753 |
# Sample documents preview
|
| 754 |
+
st.markdown("### 📁 Sample Documents")
|
| 755 |
docs = get_sample_documents()
|
| 756 |
|
| 757 |
if docs:
|
|
|
|
| 1127 |
|
| 1128 |
st.markdown("")
|
| 1129 |
|
| 1130 |
+
# Validation results with CriticAgent visibility
|
| 1131 |
if validate:
|
| 1132 |
st.markdown("---")
|
|
|
|
| 1133 |
|
| 1134 |
+
# Demo validation result from CriticAgent
|
| 1135 |
+
demo_validation = {
|
| 1136 |
+
"valid": True,
|
| 1137 |
+
"overall_score": 0.87,
|
| 1138 |
+
"dimension_scores": {
|
| 1139 |
+
"completeness": 0.92,
|
| 1140 |
+
"clarity": 0.88,
|
| 1141 |
+
"accuracy": 0.85,
|
| 1142 |
+
"actionability": 0.82,
|
| 1143 |
+
},
|
| 1144 |
+
"issues": [
|
| 1145 |
+
"Effective date confidence is below threshold (0.85)",
|
| 1146 |
+
],
|
| 1147 |
+
"suggestions": [
|
| 1148 |
+
"Consider manual verification of the effective date",
|
| 1149 |
+
"Cross-reference parties with external sources",
|
| 1150 |
+
],
|
| 1151 |
+
}
|
| 1152 |
+
|
| 1153 |
+
render_critic_validation(demo_validation)
|
| 1154 |
+
|
| 1155 |
+
# Source verification
|
| 1156 |
+
demo_sources = [
|
| 1157 |
+
{"page": 1, "snippet": "PATENT PLEDGE - This Patent Pledge is made by...", "verified": True, "confidence": 0.95},
|
| 1158 |
+
{"page": 1, "snippet": "The company hereby pledges not to assert...", "verified": True, "confidence": 0.91},
|
| 1159 |
+
{"page": 2, "snippet": "Covered Patents means all patents...", "verified": True, "confidence": 0.88},
|
| 1160 |
+
]
|
| 1161 |
+
|
| 1162 |
+
render_source_verification(demo_sources, "Patent pledge document analysis")
|
| 1163 |
|
| 1164 |
+
# Confidence indicator for overall extraction
|
| 1165 |
+
render_confidence_indicator(0.89, "Extraction Confidence")
|
| 1166 |
|
| 1167 |
|
| 1168 |
def render_rag_page():
# ... (unchanged code omitted) ...

# Footer
st.markdown("---")
st.markdown(
    """<div style='text-align: center; color: #666;'>
    🔥 SPARKNET - AI-Powered Technology Transfer Office Automation Platform<br>
    <small>VISTA/Horizon EU Project | Built with Streamlit</small><br>
    <span class="vista-badge" style="font-size: 0.7rem; padding: 0.2rem 0.5rem; margin-top: 0.5rem;">VISTA</span>
    <span class="eu-badge" style="font-size: 0.7rem; padding: 0.2rem 0.5rem; margin-top: 0.5rem;">Horizon EU</span>
    </div>""",
    unsafe_allow_html=True,
)
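The demo_validation payload above fixes the shape of a CriticAgent report. A minimal sketch of how an overall score and an abstention decision could be derived from the dimension scores; the 0.85 threshold mirrors the demo issue text, and summarize_validation is an illustrative helper, not the actual CriticAgent implementation:

from typing import Dict, List, Tuple

def summarize_validation(dimension_scores: Dict[str, float],
                         threshold: float = 0.85) -> Tuple[float, bool, List[str]]:
    """Unweighted mean of dimension scores; flag dimensions under threshold."""
    overall = sum(dimension_scores.values()) / len(dimension_scores)
    low = [d for d, s in dimension_scores.items() if s < threshold]
    # Abstain (route to human review) when any dimension falls below threshold
    return overall, not low, low

overall, passed, low = summarize_validation(
    {"completeness": 0.92, "clarity": 0.88, "accuracy": 0.85, "actionability": 0.82}
)
# overall == 0.8675 (the demo rounds to 0.87); "actionability" is flagged,
# matching the suggestion to route the result to manual verification.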
demo/auth.py
CHANGED
@@ -2,6 +2,48 @@
Simple Password Authentication for SPARKNET

Provides password-based access control for the Streamlit app.

SECURITY NOTES:
---------------
This module provides basic password authentication suitable for demos
and internal deployments. For production use, consider:

1. ENHANCED AUTHENTICATION:
   - Integrate with OAuth/OIDC (Google, Azure AD, Okta)
   - Use Streamlit's built-in OAuth support
   - Implement multi-factor authentication (MFA)

2. SESSION MANAGEMENT:
   - Configure session timeouts (default: browser session)
   - Implement session invalidation on logout
   - Consider IP-based session binding

3. PASSWORD SECURITY:
   - Use strong password requirements
   - Implement account lockout after failed attempts
   - Store passwords hashed with bcrypt (not SHA-256) for production

4. AUDIT LOGGING:
   - Log authentication attempts (success/failure)
   - Track user sessions
   - Monitor for suspicious activity

GDPR CONSIDERATIONS:
--------------------
- Authentication logs may contain personal data (usernames, IPs)
- Implement data retention policies for auth logs
- Support right-to-erasure for user accounts
- Document authentication processing in GDPR records

PRIVATE DEPLOYMENT:
-------------------
For enterprise deployments:
- Integrate with existing identity providers
- Use LDAP/Active Directory for user management
- Implement role-based access control (RBAC)
- Enable single sign-on (SSO)

See SECURITY.md for comprehensive security documentation.
"""

import streamlit as st
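The note above recommends bcrypt over SHA-256 for stored passwords. A minimal sketch of what that looks like, assuming the third-party bcrypt package were added as a dependency (it is not one today):

# Illustrative sketch - bcrypt is not currently a SPARKNET dependency.
import bcrypt

def hash_password(plain: str) -> bytes:
    # gensalt() embeds a per-password salt and a tunable work factor,
    # which is what makes bcrypt preferable to a bare SHA-256 digest
    return bcrypt.hashpw(plain.encode("utf-8"), bcrypt.gensalt())

def verify_password(plain: str, hashed: bytes) -> bool:
    # checkpw re-hashes with the stored salt and compares safely
    return bcrypt.checkpw(plain.encode("utf-8"), hashed)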
demo/llm_providers.py
CHANGED
(Rewritten end to end across five hunks. The previous version exposed a single HuggingFaceProvider with chat models DialoGPT-medium, flan-t5-base, Mistral-7B-Instruct-v0.2, and zephyr-7b-beta, plus a minimal Groq client and the OfflineProvider. The resulting file:)
"""
Free LLM Providers for SPARKNET

Supports multiple FREE-tier LLM providers:
1. Groq - Very fast, generous free tier (14,400 req/day)
2. Google Gemini - 15 req/min free
3. OpenRouter - Access to many free models
4. GitHub Models - Free GPT-4o, Llama access
5. HuggingFace Inference API - Thousands of free models
6. Together AI - $25 free credits
7. Mistral AI - Free experiment plan
8. Offline mode - No API required

SECURITY & PRIVACY CONSIDERATIONS
==================================

GDPR COMPLIANCE:
- Cloud LLM providers may process data outside the EU
- For GDPR-sensitive workloads, use:
  1. Offline mode with local Ollama
  2. EU-hosted providers (when available)
  3. Data anonymization before API calls
- Consider data processing agreements with LLM providers
- Implement data minimization - only send necessary context

DATA ISOLATION OPTIONS:
1. FULLY LOCAL (Maximum Privacy):
   - Use Ollama for 100% on-premise inference
   - No data transmitted to external services
   - Configure: set no cloud API keys, system uses offline mode

2. HYBRID (Balanced):
   - Use local Ollama for sensitive documents
   - Use cloud LLMs for general queries
   - Implement document classification for routing

3. CLOUD-ONLY (Convenience):
   - All inference via cloud providers
   - Suitable for non-sensitive/public data
   - Review provider privacy policies

PRIVATE DEPLOYMENT NOTES:
- For enterprise deployments, configure Ollama on internal network
- Use VPN/private endpoints for database connections
- Enable audit logging for all LLM interactions
- Implement rate limiting and access controls

STREAMLIT CLOUD DEPLOYMENT:
- Store API keys in Streamlit secrets (secrets.toml)
- Never commit secrets to version control
- Use environment variables as fallback
- Enable session-based authentication

Author: SPARKNET Team
Project: VISTA/Horizon EU
"""

import os
import requests
from typing import Optional, Tuple, List, Dict, Any
from dataclasses import dataclass
from loguru import logger
import streamlit as st


@dataclass
class LLMResponse:
    text: str
    model: str
    provider: str
    success: bool
    error: Optional[str] = None
    usage: Optional[Dict[str, int]] = None


def get_secret(key: str, default: Optional[str] = None) -> Optional[str]:
    """Get secret from Streamlit secrets or environment."""
    # Try Streamlit secrets first
    try:
        if hasattr(st, 'secrets') and key in st.secrets:
            return st.secrets[key]
    except Exception:
        pass
    # Fall back to environment
    return os.environ.get(key, default)


class GroqProvider:
    """
    Groq - FREE tier with very fast inference.

    Free tier: 14,400 requests/day, 300+ tokens/sec
    Get free key: https://console.groq.com/keys
    """

    API_URL = "https://api.groq.com/openai/v1/chat/completions"

    MODELS = {
        "llama-3.3-70b": "llama-3.3-70b-versatile",
        "llama-3.1-8b": "llama-3.1-8b-instant",
        "mixtral": "mixtral-8x7b-32768",
        "gemma2": "gemma2-9b-it",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("GROQ_API_KEY")
        self.name = "Groq"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 1024,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        if not self.api_key:
            return LLMResponse("", "", self.name, False, "No Groq API key")

        model = model or self.MODELS["llama-3.1-8b"]

        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        try:
            response = requests.post(
                self.API_URL,
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json"
                },
                json={
                    "model": model,
                    "messages": messages,
                    "max_tokens": max_tokens,
                    "temperature": 0.7,
                },
                timeout=30
            )
            response.raise_for_status()
            result = response.json()

            return LLMResponse(
                text=result["choices"][0]["message"]["content"],
                model=model,
                provider=self.name,
                success=True,
                usage=result.get("usage")
            )
        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))


class GoogleGeminiProvider:
    """
    Google AI Studio (Gemini) - FREE tier.

    Free tier: ~15 requests/min, Gemini 2.0 Flash & 1.5 Pro
    Get free key: https://aistudio.google.com/apikey
    """

    API_URL = "https://generativelanguage.googleapis.com/v1beta/models"

    MODELS = {
        "gemini-2.0-flash": "gemini-2.0-flash-exp",
        "gemini-1.5-flash": "gemini-1.5-flash",
        "gemini-1.5-pro": "gemini-1.5-pro",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("GOOGLE_API_KEY") or get_secret("GEMINI_API_KEY")
        self.name = "Google Gemini"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 1024,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        if not self.api_key:
            return LLMResponse("", "", self.name, False, "No Google API key")

        model = model or self.MODELS["gemini-1.5-flash"]

        # Build content; the v1beta payload has no dedicated system role,
        # so the system prompt is emulated as a user/model turn pair
        contents = []
        if system_prompt:
            contents.append({"role": "user", "parts": [{"text": system_prompt}]})
            contents.append({"role": "model", "parts": [{"text": "Understood. I will follow these instructions."}]})
        contents.append({"role": "user", "parts": [{"text": prompt}]})

        try:
            url = f"{self.API_URL}/{model}:generateContent?key={self.api_key}"
            response = requests.post(
                url,
                json={
                    "contents": contents,
                    "generationConfig": {
                        "maxOutputTokens": max_tokens,
                        "temperature": 0.7,
                    }
                },
                timeout=60
            )
            response.raise_for_status()
            result = response.json()

            text = result["candidates"][0]["content"]["parts"][0]["text"]

            return LLMResponse(
                text=text,
                model=model,
                provider=self.name,
                success=True
            )
        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))


class OpenRouterProvider:
    """
    OpenRouter - Access to many FREE models with single API key.

    Free models include: Llama, Mistral, Gemma, and more
    Get free key: https://openrouter.ai/keys
    """

    API_URL = "https://openrouter.ai/api/v1/chat/completions"

    # Free models on OpenRouter
    MODELS = {
        "llama-3.1-8b": "meta-llama/llama-3.1-8b-instruct:free",
        "gemma-2-9b": "google/gemma-2-9b-it:free",
        "mistral-7b": "mistralai/mistral-7b-instruct:free",
        "phi-3-mini": "microsoft/phi-3-mini-128k-instruct:free",
        "qwen-2-7b": "qwen/qwen-2-7b-instruct:free",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("OPENROUTER_API_KEY")
        self.name = "OpenRouter"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 1024,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        if not self.api_key:
            return LLMResponse("", "", self.name, False, "No OpenRouter API key")

        model = model or self.MODELS["llama-3.1-8b"]

        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        try:
            response = requests.post(
                self.API_URL,
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json",
                    "HTTP-Referer": "https://sparknet.streamlit.app",
                    "X-Title": "SPARKNET"
                },
                json={
                    "model": model,
                    "messages": messages,
                    "max_tokens": max_tokens,
                },
                timeout=60
            )
            response.raise_for_status()
            result = response.json()

            return LLMResponse(
                text=result["choices"][0]["message"]["content"],
                model=model,
                provider=self.name,
                success=True,
                usage=result.get("usage")
            )
        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))


class GitHubModelsProvider:
    """
    GitHub Models - FREE access to top-tier models.

    Free models: GPT-4o, Llama 3.1, Mistral, and more
    Get token: https://github.com/settings/tokens (with 'models' scope)
    """

    API_URL = "https://models.inference.ai.azure.com/chat/completions"

    MODELS = {
        "gpt-4o": "gpt-4o",
        "gpt-4o-mini": "gpt-4o-mini",
        "llama-3.1-70b": "Meta-Llama-3.1-70B-Instruct",
        "llama-3.1-8b": "Meta-Llama-3.1-8B-Instruct",
        "mistral-large": "Mistral-large",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("GITHUB_TOKEN") or get_secret("GITHUB_MODELS_TOKEN")
        self.name = "GitHub Models"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 1024,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        if not self.api_key:
            return LLMResponse("", "", self.name, False, "No GitHub token")

        model = model or self.MODELS["gpt-4o-mini"]

        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        try:
            response = requests.post(
                self.API_URL,
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json"
                },
                json={
                    "model": model,
                    "messages": messages,
                    "max_tokens": max_tokens,
                },
                timeout=60
            )
            response.raise_for_status()
            result = response.json()

            return LLMResponse(
                text=result["choices"][0]["message"]["content"],
                model=model,
                provider=self.name,
                success=True,
                usage=result.get("usage")
            )
        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))


class HuggingFaceProvider:
    """
    HuggingFace Inference API - FREE access to thousands of models.

    Get free token: https://huggingface.co/settings/tokens
    """

    API_URL = "https://api-inference.huggingface.co/models/"

    MODELS = {
        "zephyr-7b": "HuggingFaceH4/zephyr-7b-beta",
        "mistral-7b": "mistralai/Mistral-7B-Instruct-v0.2",
        "llama-2-7b": "meta-llama/Llama-2-7b-chat-hf",
        "flan-t5": "google/flan-t5-large",
        "embed": "sentence-transformers/all-MiniLM-L6-v2",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("HF_TOKEN") or get_secret("HUGGINGFACE_TOKEN")
        self.name = "HuggingFace"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 500,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        model = model or self.MODELS["zephyr-7b"]
        url = f"{self.API_URL}{model}"

        # Format prompt with system instruction
        full_prompt = prompt
        if system_prompt:
            full_prompt = f"{system_prompt}\n\nUser: {prompt}\nAssistant:"

        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"

        try:
            response = requests.post(
                url,
                headers=headers,
                json={
                    "inputs": full_prompt,
                    "parameters": {
                        "max_new_tokens": max_tokens,
                        "temperature": 0.7,
                        "do_sample": True,
                        "return_full_text": False,
                    },
                    "options": {"wait_for_model": True}
                },
                timeout=120
            )

            if response.status_code == 503:
                return LLMResponse("", model, self.name, False, "Model is loading, try again")

            response.raise_for_status()
            result = response.json()

            if isinstance(result, list) and len(result) > 0:
                text = result[0].get("generated_text", "")
            else:
                text = str(result)

            return LLMResponse(text=text, model=model, provider=self.name, success=True)

        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))

    def embed(self, texts: List[str], model: Optional[str] = None) -> Tuple[List[List[float]], Optional[str]]:
        """Generate embeddings."""
        model = model or self.MODELS["embed"]
        url = f"{self.API_URL}{model}"

        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"

        try:
            response = requests.post(
                url,
                headers=headers,
                json={"inputs": texts, "options": {"wait_for_model": True}},
                timeout=60
            )
            response.raise_for_status()
            return response.json(), None
        except Exception as e:
            return [], str(e)


class TogetherAIProvider:
    """
    Together AI - $25 FREE credits.

    Access to Llama, Mistral, and many other models
    Get free credits: https://www.together.ai/
    """

    API_URL = "https://api.together.xyz/v1/chat/completions"

    MODELS = {
        "llama-3.1-8b": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        "llama-3.1-70b": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
        "mistral-7b": "mistralai/Mistral-7B-Instruct-v0.3",
        "qwen-2-72b": "Qwen/Qwen2-72B-Instruct",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("TOGETHER_API_KEY")
        self.name = "Together AI"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 1024,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        if not self.api_key:
            return LLMResponse("", "", self.name, False, "No Together AI API key")

        model = model or self.MODELS["llama-3.1-8b"]

        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        try:
            response = requests.post(
                self.API_URL,
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json"
                },
                json={
                    "model": model,
                    "messages": messages,
                    "max_tokens": max_tokens,
                    "temperature": 0.7,
                },
                timeout=60
            )
            response.raise_for_status()
            result = response.json()

            return LLMResponse(
                text=result["choices"][0]["message"]["content"],
                model=model,
                provider=self.name,
                success=True,
                usage=result.get("usage")
            )
        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))


class MistralAIProvider:
    """
    Mistral AI - FREE "Experiment" plan.

    Get free access: https://console.mistral.ai/
    """

    API_URL = "https://api.mistral.ai/v1/chat/completions"

    MODELS = {
        "mistral-small": "mistral-small-latest",
        "mistral-medium": "mistral-medium-latest",
        "mistral-large": "mistral-large-latest",
        "codestral": "codestral-latest",
    }

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or get_secret("MISTRAL_API_KEY")
        self.name = "Mistral AI"

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key)

    def generate(self, prompt: str, model: Optional[str] = None, max_tokens: int = 1024,
                 system_prompt: Optional[str] = None) -> LLMResponse:
        if not self.api_key:
            return LLMResponse("", "", self.name, False, "No Mistral API key")

        model = model or self.MODELS["mistral-small"]

        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        try:
            response = requests.post(
                self.API_URL,
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json"
                },
                json={
                    "model": model,
                    "messages": messages,
                    "max_tokens": max_tokens,
                },
                timeout=60
            )
            response.raise_for_status()
            result = response.json()

            return LLMResponse(
                text=result["choices"][0]["message"]["content"],
                model=model,
                provider=self.name,
                success=True,
                usage=result.get("usage")
            )
        except Exception as e:
            return LLMResponse("", model, self.name, False, str(e))


class OfflineProvider:
    """
    Offline/Demo mode - No API required.

    Provides extractive responses from context for demonstration.
    """

    def __init__(self):
        self.name = "Offline"

    @property
    def is_configured(self) -> bool:
        return True

    def generate(self, prompt: str, context: str = "", **kwargs) -> LLMResponse:
        if context:
            sentences = [s.strip() for s in context.split('.') if len(s.strip()) > 20][:3]
            if sentences:
                response = f"Based on the documents: {sentences[0]}."
                if len(sentences) > 1:
                    response += f" Additionally, {sentences[1].lower()}."
            else:
                response = "I found relevant information but cannot generate a detailed response in offline mode."
        else:
            response = ("I'm running in offline demo mode. Configure a free LLM provider "
                        "(Groq, Gemini, OpenRouter, etc.) for AI-powered responses.")

        return LLMResponse(text=response, model="offline", provider=self.name, success=True)

    def embed(self, texts: List[str]) -> Tuple[List[List[float]], Optional[str]]:
        """Generate simple hash-based embeddings for demo."""
        import hashlib
        embeddings = []
        for text in texts:
            hash_bytes = hashlib.sha256(text.encode()).digest()
            embedding = [((b % 200) - 100) / 100.0 for b in (hash_bytes * 12)][:384]
            embeddings.append(embedding)
        return embeddings, None


class UnifiedLLMProvider:
    """
    Unified interface for all LLM providers.

    Automatically selects the best available provider based on configured API keys.
    Priority: Groq > Gemini > OpenRouter > GitHub > Together > Mistral > HuggingFace > Offline
    """

    def __init__(self):
        self.providers: Dict[str, Any] = {}
        self.active_provider: Optional[str] = None
        self.active_embed_provider: Optional[str] = None
        self._init_providers()

    def _init_providers(self):
        """Initialize all available providers."""

        # Initialize providers in priority order
        provider_classes = [
            ("groq", GroqProvider),
            ("gemini", GoogleGeminiProvider),
            ("openrouter", OpenRouterProvider),
            ("github", GitHubModelsProvider),
            ("together", TogetherAIProvider),
            ("mistral", MistralAIProvider),
            ("huggingface", HuggingFaceProvider),
            ("offline", OfflineProvider),
        ]

        for name, cls in provider_classes:
            try:
                provider = cls()
                self.providers[name] = provider

                # Set active provider (first configured one)
                if provider.is_configured and not self.active_provider and name != "offline":
                    self.active_provider = name
                    logger.info(f"Active LLM provider: {provider.name}")

            except Exception as e:
                logger.warning(f"Failed to init {name}: {e}")

        # Fallback to offline if nothing configured
        if not self.active_provider:
            self.active_provider = "offline"
            logger.warning("No LLM API configured, using offline mode")

        # HuggingFace for embeddings (works without token too)
        self.active_embed_provider = "huggingface"

    def generate(self, prompt: str, provider: Optional[str] = None, **kwargs) -> LLMResponse:
        """Generate text using specified or best available provider."""
        provider_name = provider or self.active_provider

        if provider_name and provider_name in self.providers:
            response = self.providers[provider_name].generate(prompt, **kwargs)
            if response.success:
                return response
            logger.warning(f"{provider_name} failed: {response.error}")

        # Fallback chain
        for name in ["groq", "gemini", "openrouter", "huggingface", "offline"]:
            if name in self.providers and name != provider_name:
                response = self.providers[name].generate(prompt, **kwargs)
                if response.success:
                    return response

        return self.providers["offline"].generate(prompt, **kwargs)

    def embed(self, texts: List[str]) -> Tuple[List[List[float]], Optional[str]]:
        """Generate embeddings."""
        if self.active_embed_provider and self.active_embed_provider in self.providers:
            provider = self.providers[self.active_embed_provider]
            if hasattr(provider, 'embed'):
                result, error = provider.embed(texts)
                if not error:
                    return result, None

        # Fallback to offline
        return self.providers["offline"].embed(texts)

    def get_status(self) -> Dict[str, Any]:
        """Get status of all providers."""
        status = {
            "active_llm": self.active_provider,
            "active_llm_name": self.providers[self.active_provider].name if self.active_provider else "None",
            "active_embed": self.active_embed_provider,
            "providers": {}
        }

        for name, provider in self.providers.items():
            status["providers"][name] = {
                "name": provider.name,
                "configured": provider.is_configured,
            }

        return status

    def list_available(self) -> List[str]:
        """List all configured providers."""
        return [name for name, p in self.providers.items() if p.is_configured and name != "offline"]


# Global instance
_llm_provider: Optional[UnifiedLLMProvider] = None


def get_llm_provider() -> UnifiedLLMProvider:
    """Get or create the global provider instance."""
    global _llm_provider
    if _llm_provider is None:
        _llm_provider = UnifiedLLMProvider()
    return _llm_provider


def generate_response(prompt: str, context: str = "", system_prompt: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """
    Convenience function to generate a response.

    Args:
        prompt: User prompt
        context: Optional context from retrieved documents
        system_prompt: Optional system instruction

    Returns:
        Tuple of (response_text, error_message)
    """
    provider = get_llm_provider()

    # Build full prompt with context
    if context:
        full_prompt = f"""Context from documents:
{context}

Question: {prompt}

Please answer based on the context provided. If the answer is not in the context, say so."""
    else:
        full_prompt = prompt

    if not system_prompt:
        system_prompt = "You are a helpful document analysis assistant. Provide accurate, concise answers based on the provided context."

    response = provider.generate(full_prompt, system_prompt=system_prompt)

    if response.success:
        return response.text, None
    else:
        return "", response.error
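A short usage sketch of the unified interface above, assuming at least one key is present in .env or secrets.toml (the demo.llm_providers import path follows the repository layout):

from demo.llm_providers import get_llm_provider, generate_response

provider = get_llm_provider()
print(provider.get_status()["active_llm"])   # e.g. "groq", or "offline"
print(provider.list_available())             # configured cloud providers

# RAG-style call: context is prepended and the model is asked to stay grounded
text, error = generate_response(
    "Who are the parties to this agreement?",
    context="This Patent Pledge is made by Example Corp ...",
)
print(text if error is None else f"LLM error: {error}")

For the FULLY LOCAL option in the module docstring, a sketch of a direct call to a local Ollama server on its default port (an OllamaProvider class does not exist in this module yet, and the model tag is illustrative):

import requests

def ollama_generate(prompt: str, model: str = "llama3.1:8b") -> str:
    """Query a local Ollama instance; no data leaves the machine."""
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]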
src/agents/scenario3/__init__.py
ADDED
@@ -0,0 +1,42 @@
"""
SPARKNET Scenario 3: License Compliance Monitoring

This module provides AI-powered license agreement monitoring and compliance tracking
for Technology Transfer Offices (TTOs).

FEATURES (Planned):
- Payment Tracking: Monitor royalty payments and fee schedules
- Milestone Verification: Track contractual milestones and deliverables
- Revenue Alerts: Automated alerts for payment anomalies and thresholds
- Compliance Reporting: Generate compliance reports for stakeholders

GDPR/PRIVACY CONSIDERATIONS:
- All license data should be stored with appropriate access controls
- Payment information requires encryption at rest and in transit
- Consider data retention policies for completed agreements
- Audit logging recommended for compliance tracking actions

VISTA/HORIZON EU ALIGNMENT:
- Supports European research valorization objectives
- Designed for university TTO workflows
- Integrates with existing agreement safety checks

DEPLOYMENT OPTIONS:
- Cloud: Use with encrypted secrets via Streamlit Cloud
- Private: On-premise deployment with local database
- Hybrid: Cloud UI with on-premise data storage

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from .license_compliance_agent import LicenseComplianceAgent
from .payment_tracker import PaymentTracker
from .milestone_verifier import MilestoneVerifier

__all__ = [
    "LicenseComplianceAgent",
    "PaymentTracker",
    "MilestoneVerifier",
]
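The docstring above recommends audit logging for compliance-tracking actions. A minimal sketch using the loguru logger already imported across the codebase; the sink path, the audit_event helper, and the example values are illustrative, not part of the module yet:

from datetime import datetime, timezone
from loguru import logger

# Dedicated JSON-lines sink that only receives records tagged audit=True
logger.add("logs/compliance_audit.log", serialize=True,
           filter=lambda record: record["extra"].get("audit", False))

def audit_event(action: str, license_id: str, actor: str) -> None:
    """Record one audit entry for a compliance action."""
    logger.bind(audit=True).info(
        "compliance_action action={} license={} actor={} at={}",
        action, license_id, actor,
        datetime.now(timezone.utc).isoformat(),
    )

audit_event("milestone_check", license_id="LIC-0001", actor="tto_staff")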
src/agents/scenario3/license_compliance_agent.py
ADDED
@@ -0,0 +1,321 @@
"""
License Compliance Agent for SPARKNET

AI-powered license agreement monitoring and compliance verification.
Part of Scenario 3: License Compliance Monitoring.

PRIVACY & SECURITY NOTES:
-------------------------
This agent handles sensitive financial and contractual data. For production deployments:

1. DATA ISOLATION:
   - License data should be stored in isolated database schemas
   - Implement row-level security for multi-tenant deployments
   - Consider geographic data residency requirements

2. GDPR COMPLIANCE:
   - Implement right-to-erasure for terminated agreements
   - Maintain data processing records
   - Enable data portability exports

3. AUDIT REQUIREMENTS:
   - All compliance checks should be logged
   - Maintain immutable audit trail
   - Enable compliance report generation

4. PRIVATE DEPLOYMENT:
   - Use Ollama for local LLM inference (no data leaves network)
   - Configure local vector store for document embeddings
   - Implement on-premise authentication

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from typing import Optional, Dict, Any, List
from dataclasses import dataclass
from datetime import datetime, date
from enum import Enum
from loguru import logger

# Note: These imports would be implemented when the module is fully developed
# from ..base_agent import BaseAgent, Task
# from ...llm.langchain_ollama_client import LangChainOllamaClient


class ComplianceStatus(str, Enum):
    """License compliance status."""
    COMPLIANT = "compliant"
    NON_COMPLIANT = "non_compliant"
    AT_RISK = "at_risk"
    PENDING_REVIEW = "pending_review"
    EXPIRED = "expired"


class PaymentStatus(str, Enum):
    """Payment tracking status."""
    PAID = "paid"
    PENDING = "pending"
    OVERDUE = "overdue"
    DISPUTED = "disputed"
    WAIVED = "waived"


@dataclass
class LicenseAgreement:
    """
    License agreement data model.

    GDPR Note: Contains potentially sensitive business information.
    Implement appropriate access controls and retention policies.
    """
    license_id: str
    agreement_name: str
    licensee_name: str
    licensor_name: str
    technology_name: str
    effective_date: date
    expiration_date: Optional[date]
    status: ComplianceStatus
    total_value: Optional[float]
    currency: str = "EUR"
    payment_schedule: Optional[List[Dict[str, Any]]] = None
    milestones: Optional[List[Dict[str, Any]]] = None
    metadata: Optional[Dict[str, Any]] = None


@dataclass
class PaymentRecord:
    """
    Payment tracking record.

    GDPR Note: Financial data - ensure encryption and access logging.
    """
    payment_id: str
    license_id: str
    amount: float
    currency: str
    due_date: date
    paid_date: Optional[date]
    status: PaymentStatus
    payment_type: str  # royalty, upfront, milestone, etc.
    notes: Optional[str] = None


@dataclass
class ComplianceAlert:
    """
    Compliance monitoring alert.

    Used for notifying TTO staff of compliance issues.
    """
    alert_id: str
    license_id: str
    alert_type: str  # payment_overdue, milestone_missed, expiring_soon, etc.
    severity: str  # low, medium, high, critical
    message: str
    created_at: datetime
    resolved: bool = False
    resolved_at: Optional[datetime] = None


class LicenseComplianceAgent:
    """
    Agent for monitoring license agreement compliance.

    This agent tracks:
    - Payment schedules and overdue payments
    - Milestone completion and deadlines
    - Agreement expiration dates
    - Compliance violations and alerts

    DEPLOYMENT CONSIDERATIONS:
    --------------------------
    For private/on-premise deployment:
    1. Configure local Ollama instance for LLM inference
    2. Use PostgreSQL with encryption for data storage
    3. Implement SSO integration for authentication
    4. Enable audit logging for all operations

    For cloud deployment (Streamlit Cloud):
    1. Use secrets management for API keys
    2. Configure secure database connection
    3. Enable HTTPS for all communications
    4. Implement rate limiting for API calls
    """

    def __init__(
        self,
        llm_client: Optional[Any] = None,  # LangChainOllamaClient when implemented
        database_url: Optional[str] = None,
    ):
        """
        Initialize License Compliance Agent.

        Args:
            llm_client: LangChain LLM client for AI analysis
            database_url: Database connection URL (use secrets management)
        """
        self.llm_client = llm_client
        self.database_url = database_url
        self.name = "LicenseComplianceAgent"
        self.description = "License agreement monitoring and compliance tracking"

        logger.info(f"Initialized {self.name} (placeholder)")

    async def check_payment_compliance(
        self,
        license_id: str,
    ) -> Dict[str, Any]:
        """
        Check payment compliance for a license agreement.

        Args:
            license_id: License agreement identifier

        Returns:
            Compliance status with payment details

        TODO: Implement actual payment tracking logic
        """
        logger.info(f"Checking payment compliance for license: {license_id}")

        # Placeholder response
        return {
            "license_id": license_id,
            "status": ComplianceStatus.PENDING_REVIEW.value,
            "message": "Payment compliance check not yet implemented",
            "payments_due": [],
            "payments_overdue": [],
            "next_payment_date": None,
            "total_outstanding": 0.0,
        }

    async def verify_milestone(
        self,
        license_id: str,
        milestone_id: str,
    ) -> Dict[str, Any]:
        """
        Verify milestone completion for a license agreement.

        Args:
            license_id: License agreement identifier
            milestone_id: Milestone identifier

        Returns:
            Milestone verification result

        TODO: Implement actual milestone verification logic
        """
        logger.info(f"Verifying milestone {milestone_id} for license: {license_id}")

        # Placeholder response
        return {
            "license_id": license_id,
            "milestone_id": milestone_id,
            "status": "pending_verification",
            "message": "Milestone verification not yet implemented",
            "evidence_required": True,
            "verification_deadline": None,
        }

    async def generate_compliance_report(
        self,
        license_ids: Optional[List[str]] = None,
        date_range: Optional[tuple] = None,
    ) -> Dict[str, Any]:
        """
        Generate compliance report for license agreements.

        Args:
            license_ids: Optional list of specific licenses to report on
            date_range: Optional (start_date, end_date) tuple

        Returns:
            Compliance report with summary and details

        TODO: Implement actual report generation logic
        """
        logger.info("Generating compliance report")

        # Placeholder response
        return {
            "report_id": f"report_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            "generated_at": datetime.now().isoformat(),
            "status": "placeholder",
            "message": "Compliance report generation not yet implemented",
            "summary": {
                "total_licenses": 0,
                "compliant": 0,
                "non_compliant": 0,
                "at_risk": 0,
            },
            "details": [],
        }

    async def create_alert(
        self,
        license_id: str,
        alert_type: str,
        severity: str,
        message: str,
    ) -> ComplianceAlert:
        """
        Create a compliance alert for TTO staff notification.

        Args:
            license_id: License agreement identifier
            alert_type: Type of alert (payment_overdue, milestone_missed, etc.)
            severity: Alert severity (low, medium, high, critical)
            message: Alert message

        Returns:
            Created compliance alert

        TODO: Implement actual alert creation and notification logic
        """
        logger.info(f"Creating {severity} alert for license: {license_id}")

        alert = ComplianceAlert(
            alert_id=f"alert_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            license_id=license_id,
            alert_type=alert_type,
            severity=severity,
            message=message,
            created_at=datetime.now(),
        )

        # TODO: Send notification (email, Slack, etc.)

        return alert

    def get_vista_quality_criteria(self) -> Dict[str, Any]:
        """
        Get VISTA quality criteria for compliance monitoring.

        Returns quality thresholds aligned with VISTA project objectives.
        """
        return {
            "payment_tracking": {
                "weight": 0.30,
                "threshold": 0.95,
                "description": "Payment records must be accurate and complete",
            },
            "milestone_verification": {
                "weight": 0.25,
                "threshold": 0.90,
                "description": "Milestone verification must include evidence",
            },
            "alert_timeliness": {
                "weight": 0.25,
                "threshold": 0.95,
                "description": "Alerts must be generated within 24 hours of trigger",
            },
            "report_accuracy": {
                "weight": 0.20,
                "threshold": 0.98,
                "description": "Reports must accurately reflect current state",
            },
        }
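Note: the payment-compliance TODO above ultimately comes down to a date comparison over PaymentRecord entries. A minimal sketch of that classification, assuming records have already been loaded from a storage layer that does not exist yet (`records` is a hypothetical input):

# Sketch: how check_payment_compliance might classify records once a storage
# layer exists. `records` is a hypothetical input list loaded elsewhere.
from datetime import date
from typing import Any, Dict, List, Optional

from src.agents.scenario3.license_compliance_agent import PaymentRecord, PaymentStatus


def classify_payments(records: List[PaymentRecord],
                      today: Optional[date] = None) -> Dict[str, Any]:
    """Split open records into due vs. overdue and total the exposure."""
    today = today or date.today()
    due: List[PaymentRecord] = []
    overdue: List[PaymentRecord] = []
    for rec in records:
        if rec.status in (PaymentStatus.PAID, PaymentStatus.WAIVED):
            continue  # settled or waived payments carry no exposure
        (overdue if rec.due_date < today else due).append(rec)
    return {
        "payments_due": due,
        "payments_overdue": overdue,
        "next_payment_date": min((r.due_date for r in due), default=None),
        "total_outstanding": sum(r.amount for r in due + overdue),
    }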
src/agents/scenario3/milestone_verifier.py
ADDED
@@ -0,0 +1,344 @@
"""
Milestone Verifier for License Compliance Monitoring

Tracks and verifies contractual milestones for license agreements.

FEATURES (Planned):
- Milestone definition and tracking
- Evidence collection for verification
- Deadline monitoring and alerts
- Integration with CriticAgent for validation

VISTA/HORIZON EU ALIGNMENT:
- Supports milestone-based payment structures common in EU research
- Integrates with project management workflows
- Provides audit trail for milestone verification

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from typing import Optional, Dict, Any, List
from dataclasses import dataclass, field
from datetime import datetime, date
from enum import Enum
from loguru import logger


class MilestoneStatus(str, Enum):
    """Milestone tracking status."""
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    SUBMITTED = "submitted"  # Awaiting verification
    VERIFIED = "verified"
    REJECTED = "rejected"
    WAIVED = "waived"
    OVERDUE = "overdue"


class MilestoneType(str, Enum):
    """Type of milestone."""
    TECHNICAL = "technical"    # Technical deliverable
    COMMERCIAL = "commercial"  # Commercial target
    REGULATORY = "regulatory"  # Regulatory approval
    FINANCIAL = "financial"    # Financial target
    REPORTING = "reporting"    # Report submission
    OTHER = "other"


@dataclass
class Milestone:
    """
    License agreement milestone definition.

    Represents a contractual milestone that must be achieved
    for license compliance.
    """
    milestone_id: str
    license_id: str
    title: str
    description: str
    milestone_type: MilestoneType
    due_date: date
    status: MilestoneStatus = MilestoneStatus.PENDING
    payment_trigger: bool = False  # If true, triggers milestone payment
    payment_amount: Optional[float] = None
    currency: str = "EUR"
    evidence_required: List[str] = field(default_factory=list)
    evidence_submitted: List[Dict[str, Any]] = field(default_factory=list)
    verified_by: Optional[str] = None
    verified_at: Optional[datetime] = None
    notes: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class VerificationResult:
    """
    Result of milestone verification.

    Includes CriticAgent validation scores when available.
    """
    verification_id: str
    milestone_id: str
    verified: bool
    confidence_score: float  # 0.0 to 1.0
    verification_notes: str
    evidence_review: List[Dict[str, Any]]
    critic_validation: Optional[Dict[str, Any]] = None  # CriticAgent output
    human_review_required: bool = False
    verified_at: datetime = field(default_factory=datetime.now)


class MilestoneVerifier:
    """
    Verifies milestone completion for license agreements.

    This component:
    - Tracks milestone deadlines
    - Collects and reviews evidence
    - Integrates with CriticAgent for AI validation
    - Implements human-in-the-loop for critical decisions

    HUMAN-IN-THE-LOOP CONSIDERATIONS:
    ----------------------------------
    Milestone verification often requires human judgment.
    This component implements:

    1. AUTOMATED VERIFICATION:
       - Document completeness checks
       - Format and structure validation
       - Cross-reference with requirements

    2. AI-ASSISTED REVIEW:
       - CriticAgent evaluates evidence quality
       - Confidence scoring for verification
       - Anomaly detection in submissions

    3. HUMAN DECISION POINTS:
       - Low-confidence verifications flagged for review
       - High-value milestones require approval
       - Rejection decisions need human confirmation

    4. AUDIT TRAIL:
       - All decisions logged with reasoning
       - Evidence preserved for compliance
       - Verification history maintained
    """

    def __init__(
        self,
        llm_client: Optional[Any] = None,
        critic_agent: Optional[Any] = None,  # CriticAgent for validation
        database_url: Optional[str] = None,
    ):
        """
        Initialize Milestone Verifier.

        Args:
            llm_client: LangChain LLM client for AI analysis
            critic_agent: CriticAgent for validation
            database_url: Database connection URL
        """
        self.llm_client = llm_client
        self.critic_agent = critic_agent
        self.database_url = database_url
        self.name = "MilestoneVerifier"

        # Threshold for requiring human review
        self.human_review_threshold = 0.7

        logger.info(f"Initialized {self.name} (placeholder)")

    async def create_milestone(
        self,
        license_id: str,
        title: str,
        description: str,
        milestone_type: MilestoneType,
        due_date: date,
        evidence_required: List[str],
        payment_trigger: bool = False,
        payment_amount: Optional[float] = None,
    ) -> Milestone:
        """
        Create a new milestone for a license agreement.

        Args:
            license_id: License agreement identifier
            title: Milestone title
            description: Detailed description
            milestone_type: Type of milestone
            due_date: Deadline for completion
            evidence_required: List of required evidence types
            payment_trigger: Whether completion triggers payment
            payment_amount: Payment amount if payment_trigger is True

        Returns:
            Created milestone

        TODO: Implement actual milestone creation logic
        """
        logger.info(f"Creating milestone '{title}' for license: {license_id}")

        return Milestone(
            milestone_id=f"ms_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            license_id=license_id,
            title=title,
            description=description,
            milestone_type=milestone_type,
            due_date=due_date,
            evidence_required=evidence_required,
            payment_trigger=payment_trigger,
            payment_amount=payment_amount,
        )

    async def submit_evidence(
        self,
        milestone_id: str,
        evidence_type: str,
        evidence_data: Dict[str, Any],
        submitted_by: str,
    ) -> Dict[str, Any]:
        """
        Submit evidence for milestone verification.

        Args:
            milestone_id: Milestone identifier
            evidence_type: Type of evidence being submitted
            evidence_data: Evidence data (documents, metrics, etc.)
            submitted_by: User/organization submitting

        Returns:
            Submission confirmation

        TODO: Implement actual evidence submission logic
        """
        logger.info(f"Submitting {evidence_type} evidence for milestone: {milestone_id}")

        # Placeholder response
        return {
            "submission_id": f"sub_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            "milestone_id": milestone_id,
            "evidence_type": evidence_type,
            "submitted_at": datetime.now().isoformat(),
            "submitted_by": submitted_by,
            "status": "received",
            "message": "Evidence submission not yet fully implemented",
        }

    async def verify_milestone(
        self,
        milestone_id: str,
        auto_approve: bool = False,
    ) -> VerificationResult:
        """
        Verify milestone completion using AI and human review.

        This method:
        1. Checks all required evidence is submitted
        2. Uses CriticAgent to validate evidence quality
        3. Calculates confidence score
        4. Determines if human review is needed

        Args:
            milestone_id: Milestone to verify
            auto_approve: Whether to auto-approve high-confidence verifications

        Returns:
            Verification result with confidence score

        TODO: Implement actual verification logic with CriticAgent
        """
        logger.info(f"Verifying milestone: {milestone_id}")

        # Placeholder verification result
        result = VerificationResult(
            verification_id=f"ver_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            milestone_id=milestone_id,
            verified=False,
            confidence_score=0.0,
            verification_notes="Verification not yet implemented",
            evidence_review=[],
            human_review_required=True,
        )

        return result

    async def get_overdue_milestones(
        self,
        as_of_date: Optional[date] = None,
    ) -> List[Milestone]:
        """
        Get list of overdue milestones.

        Args:
            as_of_date: Reference date (defaults to today)

        Returns:
            List of overdue milestones

        TODO: Implement actual overdue milestone tracking
        """
        as_of_date = as_of_date or date.today()
        logger.info(f"Checking overdue milestones as of {as_of_date}")

        # Placeholder response
        return []

    async def get_upcoming_milestones(
        self,
        days_ahead: int = 30,
    ) -> List[Milestone]:
        """
        Get milestones due in the near future.

        Args:
            days_ahead: Number of days to look ahead

        Returns:
            List of upcoming milestones

        TODO: Implement actual upcoming milestone tracking
        """
        logger.info(f"Getting milestones due in next {days_ahead} days")

        # Placeholder response
        return []

    def requires_human_review(
        self,
        confidence_score: float,
        milestone: Milestone,
    ) -> bool:
        """
        Determine if milestone verification requires human review.

        Human review is required when:
        - Confidence score is below threshold
        - Milestone triggers large payment
        - Milestone type is regulatory
        - Evidence is incomplete or ambiguous

        Args:
            confidence_score: AI verification confidence
            milestone: Milestone being verified

        Returns:
            True if human review required
        """
        # Low confidence requires review
        if confidence_score < self.human_review_threshold:
            return True

        # Large payments require review
        if milestone.payment_trigger and milestone.payment_amount:
            if milestone.payment_amount > 50000:  # EUR threshold
                return True

        # Regulatory milestones always require review
        if milestone.milestone_type == MilestoneType.REGULATORY:
            return True

        return False
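Note: the gate in `requires_human_review` is the one piece of working logic in this file. A short illustration with invented values, showing that regulatory milestones are routed to a human regardless of confidence:

# Illustration of the committed review gate; all field values are invented.
from datetime import date

from src.agents.scenario3.milestone_verifier import (
    Milestone, MilestoneType, MilestoneVerifier,
)

verifier = MilestoneVerifier()
ms = Milestone(
    milestone_id="ms_demo",
    license_id="lic_0001",
    title="CE marking obtained",
    description="Regulatory approval for EU market entry",
    milestone_type=MilestoneType.REGULATORY,
    due_date=date(2025, 6, 30),
)
# Regulatory milestones always go to a human, even at high confidence:
assert verifier.requires_human_review(confidence_score=0.99, milestone=ms)
# Low-confidence results are flagged regardless of milestone type:
assert verifier.requires_human_review(confidence_score=0.5, milestone=ms)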
src/agents/scenario3/payment_tracker.py
ADDED
@@ -0,0 +1,277 @@
"""
Payment Tracker for License Compliance Monitoring

Tracks royalty payments, fees, and financial obligations for license agreements.

SECURITY CONSIDERATIONS:
------------------------
This module handles sensitive financial data. Ensure:
1. All payment data is encrypted at rest (AES-256 recommended)
2. Access is logged for audit compliance
3. PCI-DSS guidelines are followed if processing card data
4. Data retention policies are implemented

GDPR COMPLIANCE:
---------------
- Payment records may contain personal data of signatories
- Implement data minimization - only store necessary fields
- Support data portability and right-to-erasure requests
- Maintain records of processing activities

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from typing import Optional, Dict, Any, List
from dataclasses import dataclass, field
from datetime import datetime, date, timedelta
from enum import Enum
from loguru import logger


class PaymentFrequency(str, Enum):
    """Payment schedule frequency."""
    ONE_TIME = "one_time"
    MONTHLY = "monthly"
    QUARTERLY = "quarterly"
    SEMI_ANNUAL = "semi_annual"
    ANNUAL = "annual"
    MILESTONE_BASED = "milestone_based"


class RevenueType(str, Enum):
    """Type of revenue/payment."""
    UPFRONT_FEE = "upfront_fee"
    ROYALTY = "royalty"
    MILESTONE_PAYMENT = "milestone_payment"
    MAINTENANCE_FEE = "maintenance_fee"
    SUBLICENSE_FEE = "sublicense_fee"
    MINIMUM_PAYMENT = "minimum_payment"


@dataclass
class PaymentSchedule:
    """
    Payment schedule configuration for a license agreement.

    GDPR Note: May reference personal data (contact info).
    Implement appropriate access controls.
    """
    schedule_id: str
    license_id: str
    frequency: PaymentFrequency
    revenue_type: RevenueType
    base_amount: Optional[float] = None
    percentage_rate: Optional[float] = None  # For royalties
    currency: str = "EUR"
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    payment_terms_days: int = 30  # Days until payment is due
    late_fee_percentage: float = 0.0
    minimum_payment: Optional[float] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class RevenueAlert:
    """
    Revenue monitoring alert configuration.

    Used to notify TTO staff of payment anomalies.
    """
    alert_id: str
    license_id: str
    alert_type: str  # threshold_exceeded, payment_overdue, anomaly_detected
    threshold_value: Optional[float] = None
    comparison_operator: str = "greater_than"  # greater_than, less_than, equals
    notification_channels: List[str] = field(default_factory=list)  # email, slack, sms
    enabled: bool = True


class PaymentTracker:
    """
    Tracks payments and revenue for license agreements.

    This component:
    - Records incoming payments
    - Tracks overdue payments
    - Generates revenue alerts
    - Produces financial reports

    PRIVATE DEPLOYMENT NOTES:
    -------------------------
    For on-premise deployment:
    1. Use local PostgreSQL with SSL/TLS
    2. Implement database connection pooling
    3. Configure backup and disaster recovery
    4. Set up monitoring for payment processing

    For enhanced security:
    1. Use hardware security modules (HSM) for encryption keys
    2. Implement IP allowlisting for database access
    3. Enable query auditing
    4. Configure intrusion detection
    """

    def __init__(
        self,
        database_url: Optional[str] = None,
        notification_service: Optional[Any] = None,
    ):
        """
        Initialize Payment Tracker.

        Args:
            database_url: Secure database connection URL
            notification_service: Service for sending alerts
        """
        self.database_url = database_url
        self.notification_service = notification_service
        self.name = "PaymentTracker"

        logger.info(f"Initialized {self.name} (placeholder)")

    async def record_payment(
        self,
        license_id: str,
        amount: float,
        currency: str,
        payment_date: date,
        revenue_type: RevenueType,
        reference: Optional[str] = None,
    ) -> Dict[str, Any]:
        """
        Record a payment received for a license agreement.

        Args:
            license_id: License agreement identifier
            amount: Payment amount
            currency: Currency code (EUR, USD, etc.)
            payment_date: Date payment was received
            revenue_type: Type of revenue
            reference: Payment reference/invoice number

        Returns:
            Payment record confirmation

        TODO: Implement actual payment recording logic
        """
        logger.info(f"Recording payment of {amount} {currency} for license: {license_id}")

        # Placeholder response
        return {
            "payment_id": f"pmt_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            "license_id": license_id,
            "amount": amount,
            "currency": currency,
            "payment_date": payment_date.isoformat(),
            "revenue_type": revenue_type.value,
            "reference": reference,
            "status": "recorded",
            "message": "Payment recording not yet fully implemented",
        }

    async def get_overdue_payments(
        self,
        as_of_date: Optional[date] = None,
        days_overdue: int = 0,
    ) -> List[Dict[str, Any]]:
        """
        Get list of overdue payments.

        Args:
            as_of_date: Reference date (defaults to today)
            days_overdue: Minimum days overdue to include

        Returns:
            List of overdue payment records

        TODO: Implement actual overdue payment tracking
        """
        as_of_date = as_of_date or date.today()
        logger.info(f"Checking overdue payments as of {as_of_date}")

        # Placeholder response
        return []

    async def calculate_revenue_summary(
        self,
        start_date: date,
        end_date: date,
        group_by: str = "month",
    ) -> Dict[str, Any]:
        """
        Calculate revenue summary for a date range.

        Args:
            start_date: Start of reporting period
            end_date: End of reporting period
            group_by: Grouping period (day, week, month, quarter, year)

        Returns:
            Revenue summary with breakdowns

        TODO: Implement actual revenue calculation
        """
        logger.info(f"Calculating revenue from {start_date} to {end_date}")

        # Placeholder response
        return {
            "period": {
                "start": start_date.isoformat(),
                "end": end_date.isoformat(),
            },
            "total_revenue": 0.0,
            "currency": "EUR",
            "by_period": [],
            "by_revenue_type": {},
            "by_license": {},
            "status": "placeholder",
            "message": "Revenue calculation not yet implemented",
        }

    async def create_revenue_alert(
        self,
        license_id: str,
        alert_type: str,
        threshold: float,
        notification_channels: List[str],
    ) -> RevenueAlert:
        """
        Create a revenue monitoring alert.

        Args:
            license_id: License to monitor
            alert_type: Type of alert
            threshold: Threshold value
            notification_channels: Where to send alerts

        Returns:
            Created alert configuration

        TODO: Implement actual alert creation logic
        """
        logger.info(f"Creating revenue alert for license: {license_id}")

        return RevenueAlert(
            alert_id=f"ralert_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            license_id=license_id,
            alert_type=alert_type,
            threshold_value=threshold,
            notification_channels=notification_channels,
        )

    async def check_revenue_thresholds(self) -> List[Dict[str, Any]]:
        """
        Check all revenue alerts and generate notifications.

        Returns:
            List of triggered alerts

        TODO: Implement actual threshold checking
        """
        logger.info("Checking revenue thresholds")

        # Placeholder response
        return []
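Note: a sketch of how the `payment_terms_days` and `late_fee_percentage` fields of PaymentSchedule might combine once invoices exist. A single flat late fee is an interpretation, not anything this commit specifies; `invoice_date` and `amount` are hypothetical inputs.

# Sketch only: applying payment terms and late fees from PaymentSchedule.
# The one-off late fee is an assumption, not the committed spec.
from datetime import date, timedelta

from src.agents.scenario3.payment_tracker import (
    PaymentFrequency, PaymentSchedule, RevenueType,
)


def amount_due(schedule: PaymentSchedule, invoice_date: date,
               amount: float, as_of: date) -> float:
    due_date = invoice_date + timedelta(days=schedule.payment_terms_days)
    if as_of <= due_date:
        return amount
    return amount * (1 + schedule.late_fee_percentage / 100.0)


schedule = PaymentSchedule(
    schedule_id="sch_demo",
    license_id="lic_0001",
    frequency=PaymentFrequency.QUARTERLY,
    revenue_type=RevenueType.ROYALTY,
    payment_terms_days=30,
    late_fee_percentage=2.0,
)
# 2% late fee applied once the 30-day term has lapsed:
print(amount_due(schedule, date(2025, 1, 1), 1000.0, as_of=date(2025, 3, 1)))  # 1020.0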
src/agents/scenario4/__init__.py
ADDED
@@ -0,0 +1,36 @@
"""
SPARKNET Scenario 4: Award Identification

This module provides AI-powered funding opportunity discovery and award
nomination assistance for Technology Transfer Offices (TTOs).

FEATURES (Planned):
- Opportunity Scanning: Automated discovery of funding opportunities
- Nomination Assistance: AI-assisted award nomination preparation
- Deadline Tracking: Calendar integration for application deadlines
- Application Support: Document preparation and review

VISTA/HORIZON EU ALIGNMENT:
- Designed for European research funding landscape
- Supports Horizon Europe, ERC, and national funding programs
- Integrates with TTO workflows for commercialization grants

DEPLOYMENT OPTIONS:
- Cloud: Streamlit Cloud with API integrations
- Private: On-premise with local databases
- Hybrid: Cloud scanning with on-premise data storage

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from .award_identification_agent import AwardIdentificationAgent
from .opportunity_scanner import OpportunityScanner
from .nomination_assistant import NominationAssistant

__all__ = [
    "AwardIdentificationAgent",
    "OpportunityScanner",
    "NominationAssistant",
]
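Note: as with scenario3, a hypothetical usage sketch against the placeholder API; the scan currently returns an empty list, and the keyword values are invented.

# Hypothetical usage of the scenario4 package as committed; scan_opportunities
# is a placeholder and currently returns [].
import asyncio

from src.agents.scenario4 import AwardIdentificationAgent
from src.agents.scenario4.award_identification_agent import OpportunityType


async def main() -> None:
    agent = AwardIdentificationAgent()
    hits = await agent.scan_opportunities(
        keywords=["technology transfer", "valorization"],
        opportunity_types=[OpportunityType.GRANT, OpportunityType.AWARD],
    )
    print(f"Found {len(hits)} opportunities")


if __name__ == "__main__":
    asyncio.run(main())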
src/agents/scenario4/award_identification_agent.py
ADDED
@@ -0,0 +1,392 @@
"""
Award Identification Agent for SPARKNET

AI-powered funding opportunity discovery and award nomination assistance.
Part of Scenario 4: Award Identification.

FEATURES:
---------
1. OPPORTUNITY DISCOVERY:
   - Scan funding databases and announcements
   - Match opportunities to research capabilities
   - Track application deadlines

2. NOMINATION ASSISTANCE:
   - Prepare award nomination documents
   - Review and validate submissions
   - Generate supporting materials

3. APPLICATION SUPPORT:
   - Document preparation workflows
   - Compliance checking
   - Reviewer matching

INTEGRATIONS (Planned):
-----------------------
- Horizon Europe CORDIS database
- National funding agency APIs
- ERC portal integration
- Patent databases for innovation evidence

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from typing import Optional, Dict, Any, List
from dataclasses import dataclass, field
from datetime import datetime, date
from enum import Enum
from loguru import logger


class OpportunityType(str, Enum):
    """Type of funding opportunity."""
    GRANT = "grant"
    AWARD = "award"
    FELLOWSHIP = "fellowship"
    PRIZE = "prize"
    INVESTMENT = "investment"
    PARTNERSHIP = "partnership"


class OpportunityStatus(str, Enum):
    """Opportunity tracking status."""
    IDENTIFIED = "identified"
    EVALUATING = "evaluating"
    PREPARING = "preparing"
    SUBMITTED = "submitted"
    AWARDED = "awarded"
    REJECTED = "rejected"
    EXPIRED = "expired"


class EligibilityStatus(str, Enum):
    """Eligibility assessment status."""
    ELIGIBLE = "eligible"
    INELIGIBLE = "ineligible"
    PARTIAL = "partial"  # Some criteria met
    UNKNOWN = "unknown"  # Needs review


@dataclass
class FundingOpportunity:
    """
    Funding opportunity data model.

    Represents a grant, award, or other funding opportunity
    identified by the scanning system.
    """
    opportunity_id: str
    title: str
    description: str
    opportunity_type: OpportunityType
    funder: str
    funder_type: str  # government, foundation, corporate, EU, etc.
    amount_min: Optional[float] = None
    amount_max: Optional[float] = None
    currency: str = "EUR"
    deadline: Optional[date] = None
    url: Optional[str] = None
    eligibility_criteria: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    status: OpportunityStatus = OpportunityStatus.IDENTIFIED
    match_score: Optional[float] = None  # How well it matches capabilities
    notes: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class OpportunityMatch:
    """
    Match between opportunity and research/technology.

    Represents alignment between a funding opportunity
    and institutional capabilities.
    """
    match_id: str
    opportunity_id: str
    technology_id: Optional[str] = None
    research_area: Optional[str] = None
    match_score: float = 0.0  # 0.0 to 1.0
    match_rationale: str = ""
    eligibility_status: EligibilityStatus = EligibilityStatus.UNKNOWN
    eligibility_notes: List[str] = field(default_factory=list)
    recommended_action: str = ""
    confidence_score: float = 0.0


@dataclass
class NominationDocument:
    """
    Award nomination document.

    Contains structured content for award/grant applications.
    """
    document_id: str
    opportunity_id: str
    document_type: str  # proposal, nomination_letter, cv, budget, etc.
    title: str
    content: str
    version: str = "1.0"
    status: str = "draft"  # draft, review, final
    created_at: datetime = field(default_factory=datetime.now)
    updated_at: datetime = field(default_factory=datetime.now)
    created_by: Optional[str] = None
    reviewer_comments: List[Dict[str, Any]] = field(default_factory=list)
    critic_validation: Optional[Dict[str, Any]] = None


class AwardIdentificationAgent:
    """
    Agent for identifying funding opportunities and assisting nominations.

    This agent:
    - Scans funding databases for opportunities
    - Matches opportunities to research capabilities
    - Assists with nomination document preparation
    - Tracks application deadlines and status

    HUMAN-IN-THE-LOOP WORKFLOW:
    ---------------------------
    Award applications require human judgment. This agent implements:

    1. AUTOMATED SCANNING:
       - Regular scans of funding databases
       - Keyword matching and filtering
       - Initial eligibility screening

    2. AI-ASSISTED MATCHING:
       - Score opportunities against capabilities
       - Generate match rationale
       - Identify gaps and risks

    3. HUMAN DECISION POINTS:
       - Approval to pursue opportunities
       - Review of application documents
       - Final submission authorization

    4. QUALITY ASSURANCE:
       - CriticAgent validation of documents
       - Compliance checking
       - Reviewer feedback integration
    """

    def __init__(
        self,
        llm_client: Optional[Any] = None,
        critic_agent: Optional[Any] = None,
        database_url: Optional[str] = None,
    ):
        """
        Initialize Award Identification Agent.

        Args:
            llm_client: LangChain LLM client for AI analysis
            critic_agent: CriticAgent for document validation
            database_url: Database connection URL
        """
        self.llm_client = llm_client
        self.critic_agent = critic_agent
        self.database_url = database_url
        self.name = "AwardIdentificationAgent"
        self.description = "Funding opportunity discovery and nomination assistance"

        logger.info(f"Initialized {self.name} (placeholder)")

    async def scan_opportunities(
        self,
        keywords: Optional[List[str]] = None,
        opportunity_types: Optional[List[OpportunityType]] = None,
        min_amount: Optional[float] = None,
        max_deadline_days: Optional[int] = None,
    ) -> List[FundingOpportunity]:
        """
        Scan for funding opportunities matching criteria.

        Args:
            keywords: Keywords to search for
            opportunity_types: Types of opportunities to find
            min_amount: Minimum funding amount
            max_deadline_days: Maximum days until deadline

        Returns:
            List of matching opportunities

        TODO: Implement actual opportunity scanning
        """
        logger.info(f"Scanning for opportunities with keywords: {keywords}")

        # Placeholder - would integrate with funding databases
        return []

    async def match_opportunity(
        self,
        opportunity_id: str,
        technology_ids: Optional[List[str]] = None,
        research_areas: Optional[List[str]] = None,
    ) -> OpportunityMatch:
        """
        Evaluate match between opportunity and capabilities.

        Args:
            opportunity_id: Opportunity to evaluate
            technology_ids: Technologies to consider
            research_areas: Research areas to consider

        Returns:
            Match result with score and rationale

        TODO: Implement actual matching logic
        """
        logger.info(f"Matching opportunity: {opportunity_id}")

        # Placeholder response
        return OpportunityMatch(
            match_id=f"match_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            opportunity_id=opportunity_id,
            match_score=0.0,
            match_rationale="Matching not yet implemented",
            eligibility_status=EligibilityStatus.UNKNOWN,
            recommended_action="Review manually",
            confidence_score=0.0,
        )

    async def check_eligibility(
        self,
        opportunity_id: str,
        applicant_profile: Dict[str, Any],
    ) -> Dict[str, Any]:
        """
        Check eligibility for a funding opportunity.

        Args:
            opportunity_id: Opportunity to check
            applicant_profile: Profile of potential applicant

        Returns:
            Eligibility assessment with details

        TODO: Implement actual eligibility checking
        """
        logger.info(f"Checking eligibility for opportunity: {opportunity_id}")

        # Placeholder response
        return {
            "opportunity_id": opportunity_id,
            "status": EligibilityStatus.UNKNOWN.value,
            "criteria_met": [],
            "criteria_not_met": [],
            "criteria_unknown": [],
            "recommendation": "Manual review required",
            "confidence": 0.0,
        }

    async def prepare_nomination(
        self,
        opportunity_id: str,
        document_type: str,
        context: Dict[str, Any],
    ) -> NominationDocument:
        """
        Prepare a nomination/application document.

        Args:
            opportunity_id: Target opportunity
            document_type: Type of document to prepare
            context: Context information for document generation

        Returns:
            Generated nomination document

        TODO: Implement actual document preparation with LLM
        """
        logger.info(f"Preparing {document_type} for opportunity: {opportunity_id}")

        # Placeholder response
        return NominationDocument(
            document_id=f"doc_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            opportunity_id=opportunity_id,
            document_type=document_type,
            title=f"{document_type.replace('_', ' ').title()} - Draft",
            content="[Document content to be generated]",
            status="draft",
        )

    async def validate_document(
        self,
        document: NominationDocument,
    ) -> Dict[str, Any]:
        """
        Validate nomination document using CriticAgent.

        Args:
            document: Document to validate

        Returns:
            Validation result with suggestions

        TODO: Implement CriticAgent integration
        """
        logger.info(f"Validating document: {document.document_id}")

        # Placeholder response
        return {
            "document_id": document.document_id,
            "valid": False,
            "overall_score": 0.0,
            "dimension_scores": {},
            "issues": ["Validation not yet implemented"],
            "suggestions": ["Complete document implementation"],
            "human_review_required": True,
        }

    async def get_upcoming_deadlines(
        self,
        days_ahead: int = 30,
    ) -> List[FundingOpportunity]:
        """
        Get opportunities with upcoming deadlines.

        Args:
            days_ahead: Number of days to look ahead

        Returns:
            List of opportunities with deadlines

        TODO: Implement actual deadline tracking
        """
        logger.info(f"Getting deadlines for next {days_ahead} days")

        # Placeholder response
        return []

    def get_vista_quality_criteria(self) -> Dict[str, Any]:
        """
        Get VISTA quality criteria for award identification.

        Returns quality thresholds for opportunity matching and
        document preparation aligned with VISTA objectives.
        """
        return {
            "opportunity_relevance": {
                "weight": 0.30,
                "threshold": 0.75,
                "description": "Opportunities must be relevant to research capabilities",
            },
            "eligibility_accuracy": {
                "weight": 0.25,
                "threshold": 0.90,
                "description": "Eligibility assessments must be accurate",
            },
            "document_quality": {
                "weight": 0.25,
                "threshold": 0.85,
                "description": "Nomination documents must meet quality standards",
            },
            "deadline_tracking": {
                "weight": 0.20,
                "threshold": 0.95,
                "description": "Deadlines must be tracked accurately",
            },
        }
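Note: the matching TODO in `match_opportunity` leaves the scoring method open. One plausible baseline, not the committed design, is keyword overlap between an opportunity and a set of institutional capability keywords (`capability_keywords` is a hypothetical input):

# Sketch of a baseline scorer for match_opportunity; Jaccard overlap is one
# choice among many and is not specified by this commit.
from typing import Set

from src.agents.scenario4.award_identification_agent import FundingOpportunity


def keyword_match_score(opp: FundingOpportunity,
                        capability_keywords: Set[str]) -> float:
    """Jaccard overlap between opportunity and capability keywords, in [0, 1]."""
    opp_kw = {k.lower() for k in opp.keywords}
    caps = {k.lower() for k in capability_keywords}
    if not opp_kw or not caps:
        return 0.0
    return len(opp_kw & caps) / len(opp_kw | caps)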
src/agents/scenario4/nomination_assistant.py
ADDED
@@ -0,0 +1,371 @@
"""
Nomination Assistant for Award Identification

AI-assisted preparation of award nominations and grant applications.

FEATURES (Planned):
-------------------
1. DOCUMENT GENERATION:
   - Executive summaries
   - Project descriptions
   - Budget justifications
   - Team CVs and bios

2. TEMPLATE MATCHING:
   - Match to funder templates
   - Format compliance checking
   - Character/word limit validation

3. QUALITY ASSURANCE:
   - CriticAgent validation
   - Reviewer simulation
   - Gap identification

4. COLLABORATION:
   - Multi-author editing
   - Comment and review workflows
   - Version control

HUMAN-IN-THE-LOOP:
------------------
Document preparation requires extensive human input:
- Initial content drafting
- Review and revision cycles
- Final approval before submission

This assistant accelerates the process but doesn't replace
human expertise in grant writing.

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from typing import Optional, Dict, Any, List
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from loguru import logger


class DocumentTemplate(str, Enum):
    """Standard document templates."""
    HORIZON_PROPOSAL = "horizon_proposal"
    ERC_APPLICATION = "erc_application"
    NATIONAL_GRANT = "national_grant"
    AWARD_NOMINATION = "award_nomination"
    LETTER_OF_INTENT = "letter_of_intent"
    BUDGET_TEMPLATE = "budget_template"
    CV_EUROPASS = "cv_europass"
    CUSTOM = "custom"


class ReviewStatus(str, Enum):
    """Document review status."""
    DRAFT = "draft"
    INTERNAL_REVIEW = "internal_review"
    REVISION_NEEDED = "revision_needed"
    APPROVED = "approved"
    SUBMITTED = "submitted"


@dataclass
class DocumentSection:
    """
    Section of a nomination document.

    Represents a structured section with content and metadata.
    """
    section_id: str
    title: str
    content: str
    word_limit: Optional[int] = None
    current_words: int = 0
    status: str = "draft"
    ai_generated: bool = False
    human_reviewed: bool = False
    reviewer_comments: List[str] = field(default_factory=list)
    suggestions: List[str] = field(default_factory=list)


@dataclass
class DocumentReview:
    """
    Review of a nomination document.

    Contains feedback from AI and human reviewers.
    """
    review_id: str
    document_id: str
    reviewer_type: str  # "ai", "human", "external"
    reviewer_name: Optional[str] = None
    overall_score: Optional[float] = None
    section_scores: Dict[str, float] = field(default_factory=dict)
    strengths: List[str] = field(default_factory=list)
    weaknesses: List[str] = field(default_factory=list)
    suggestions: List[str] = field(default_factory=list)
    decision: str = "pending"  # approve, revise, reject
    created_at: datetime = field(default_factory=datetime.now)


class NominationAssistant:
    """
    AI assistant for preparing nominations and applications.

    This component:
    - Generates document sections
    - Checks format compliance
    - Simulates reviewer feedback
    - Manages revision workflows

    INTEGRATION WITH CRITICAGENT:
    -----------------------------
    Uses CriticAgent for:
    - Document quality validation
    - Format compliance checking
    - Reviewer perspective simulation
    - Gap and weakness identification

    CONFIDENCE SCORING:
    -------------------
    All AI-generated content includes:
    - Confidence score (0.0-1.0)
    - Source references where applicable
    - Suggestions for improvement
    - Flag for human review

    Generated content with low confidence scores
    is automatically flagged for human review.
    """

    def __init__(
        self,
        llm_client: Optional[Any] = None,
        critic_agent: Optional[Any] = None,
        template_library: Optional[Dict[str, Any]] = None,
    ):
        """
        Initialize Nomination Assistant.

        Args:
            llm_client: LangChain LLM client for content generation
            critic_agent: CriticAgent for validation
            template_library: Library of document templates
        """
        self.llm_client = llm_client
        self.critic_agent = critic_agent
        self.template_library = template_library or {}
        self.name = "NominationAssistant"

        # Threshold for requiring human review
        self.confidence_threshold = 0.7

        logger.info(f"Initialized {self.name} (placeholder)")

    async def generate_section(
        self,
        document_id: str,
        section_type: str,
        context: Dict[str, Any],
        word_limit: Optional[int] = None,
    ) -> DocumentSection:
        """
        Generate a document section using AI.

        Args:
            document_id: Parent document ID
            section_type: Type of section to generate
            context: Context information for generation
            word_limit: Optional word limit

        Returns:
            Generated section with confidence score

        TODO: Implement actual LLM generation
        """
        logger.info(f"Generating {section_type} section for document: {document_id}")

        # Placeholder response
        return DocumentSection(
            section_id=f"sec_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            title=section_type.replace("_", " ").title(),
            content="[AI-generated content placeholder]",
            word_limit=word_limit,
            current_words=0,
            status="draft",
            ai_generated=True,
            human_reviewed=False,
            suggestions=["Complete implementation with actual LLM generation"],
        )

    async def check_format_compliance(
        self,
        document_id: str,
        template: DocumentTemplate,
    ) -> Dict[str, Any]:
        """
        Check document compliance with template requirements.

        Args:
            document_id: Document to check
            template: Template to check against

        Returns:
            Compliance report with issues

        TODO: Implement actual compliance checking
        """
        logger.info(f"Checking format compliance for document: {document_id}")

        # Placeholder response
        return {
            "document_id": document_id,
            "template": template.value,
            "compliant": False,
            "issues": [
                {
                    "type": "placeholder",
                    "message": "Compliance checking not yet implemented",
                    "severity": "info",
                }
            ],
            "word_counts": {},
            "missing_sections": [],
        }

    async def simulate_review(
        self,
        document_id: str,
        reviewer_perspective: str = "general",
    ) -> DocumentReview:
        """
        Simulate reviewer feedback using AI.

        Generates feedback from the perspective of a grant
        reviewer to identify potential weaknesses.

        Args:
            document_id: Document to review
            reviewer_perspective: Type of reviewer to simulate

        Returns:
            Simulated review with scores and feedback

        TODO: Implement actual review simulation
        """
        logger.info(f"Simulating {reviewer_perspective} review for document: {document_id}")

        # Placeholder response
        return DocumentReview(
            review_id=f"rev_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            document_id=document_id,
            reviewer_type="ai",
            reviewer_name=f"AI ({reviewer_perspective})",
            overall_score=0.0,
            strengths=["Review simulation not yet implemented"],
            weaknesses=["Cannot assess without implementation"],
            suggestions=["Complete the AI review simulation feature"],
            decision="pending",
        )

    async def suggest_improvements(
        self,
        section: DocumentSection,
    ) -> List[str]:
        """
        Suggest improvements for a document section.

        Uses CriticAgent to analyze section and generate
        actionable improvement suggestions.

        Args:
            section: Section to analyze

        Returns:
            List of improvement suggestions

        TODO: Implement CriticAgent integration
        """
        logger.info(f"Generating improvement suggestions for section: {section.section_id}")

        # Placeholder response
        return [
            "Improvement suggestions not yet implemented",
            "Will integrate with CriticAgent for validation",
        ]

    async def validate_with_critic(
        self,
        document_id: str,
    ) -> Dict[str, Any]:
        """
        Validate document using CriticAgent.

        Performs comprehensive validation including:
        - Content quality assessment
        - Format compliance
        - Logical consistency
        - Citation verification

        Args:
            document_id: Document to validate

        Returns:
            Validation result with scores and issues

        TODO: Implement CriticAgent integration
        """
        logger.info(f"Validating document with CriticAgent: {document_id}")

        # Placeholder response
        return {
            "document_id": document_id,
            "valid": False,
            "overall_score": 0.0,
            "dimension_scores": {
                "completeness": 0.0,
                "clarity": 0.0,
                "accuracy": 0.0,
                "compliance": 0.0,
            },
            "issues": ["CriticAgent validation not yet implemented"],
            "suggestions": ["Complete CriticAgent integration"],
            "human_review_required": True,
            "confidence": 0.0,
        }

    def requires_human_review(
        self,
        confidence_score: float,
        section_type: str,
    ) -> bool:
        """
        Determine if content requires human review.

        Human review is required when:
        - Confidence is below threshold
        - Section is critical (executive summary, budget)
        - Content makes claims about capabilities

        Args:
            confidence_score: AI confidence score
            section_type: Type of section

        Returns:
            True if human review required
        """
        # Low confidence always requires review
        if confidence_score < self.confidence_threshold:
            return True

        # Critical sections always require review
        critical_sections = [
            "executive_summary",
            "budget",
            "team_qualifications",
            "methodology",
        ]
        if section_type.lower() in critical_sections:
            return True

        return False
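A hypothetical usage sketch of the review gate above, not part of this commit: the import path follows the file location shown, the document ID is a placeholder, and since generate_section() does not yet return a confidence score, 0.0 is assumed until the LLM integration lands.

# Hypothetical usage (not in the commit): draft a section, then route it
# through the human-review gate before it can leave "draft" status.
import asyncio

from src.agents.scenario4.nomination_assistant import NominationAssistant  # assumed path

async def draft_with_review_gate() -> None:
    assistant = NominationAssistant()  # no LLM or CriticAgent wired in yet
    section = await assistant.generate_section(
        document_id="doc_20250101_120000",  # placeholder ID
        section_type="executive_summary",
        context={"project": "SPARKNET"},
        word_limit=500,
    )
    # Placeholder sections carry no confidence score yet; assume 0.0 until
    # generate_section() returns one alongside the content.
    if assistant.requires_human_review(confidence_score=0.0,
                                       section_type="executive_summary"):
        section.status = "internal_review"  # matches ReviewStatus.INTERNAL_REVIEW

asyncio.run(draft_with_review_gate())

Note that "executive_summary" sits in critical_sections, so this section is held for a human regardless of confidence, which is the intended behavior for high-stakes content.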
src/agents/scenario4/opportunity_scanner.py ADDED
@@ -0,0 +1,278 @@
"""
Opportunity Scanner for Award Identification

Scans funding databases and announcement sources for opportunities.

PLANNED DATA SOURCES:
---------------------
- Horizon Europe / CORDIS
- European Research Council (ERC)
- National funding agencies (DFG, ANR, UKRI, etc.)
- Foundation databases
- Corporate R&D partnerships
- Innovation prizes and awards

SCANNING STRATEGY:
------------------
1. KEYWORD MATCHING:
   - Technology-specific terms
   - Research area keywords
   - Institution eligibility terms

2. SEMANTIC SEARCH:
   - Vector similarity to capability descriptions
   - Cross-lingual matching for EU opportunities

3. FILTERING:
   - Deadline filtering (exclude expired)
   - Amount thresholds
   - Eligibility pre-screening

Author: SPARKNET Team
Project: VISTA/Horizon EU
Status: Placeholder - In Development
"""

from typing import Optional, Dict, Any, List
from dataclasses import dataclass, field
from datetime import datetime, date
from enum import Enum
from loguru import logger


class DataSource(str, Enum):
    """Funding data sources."""
    CORDIS = "cordis"          # Horizon Europe
    ERC = "erc"                # European Research Council
    NATIONAL = "national"      # National agencies
    FOUNDATION = "foundation"
    CORPORATE = "corporate"
    CUSTOM = "custom"


@dataclass
class ScanConfiguration:
    """
    Configuration for opportunity scanning.

    Defines what and how to scan for opportunities.
    """
    config_id: str
    name: str
    sources: List[DataSource]
    keywords: List[str]
    research_areas: List[str]
    min_amount: Optional[float] = None
    max_amount: Optional[float] = None
    currency: str = "EUR"
    exclude_expired: bool = True
    include_rolling: bool = True  # Include opportunities with no fixed deadline
    scan_frequency_hours: int = 24
    last_scan: Optional[datetime] = None
    enabled: bool = True


@dataclass
class ScanResult:
    """
    Result of an opportunity scan.

    Contains discovered opportunities and scan metadata.
    """
    scan_id: str
    config_id: str
    started_at: datetime
    completed_at: Optional[datetime] = None
    sources_scanned: List[str] = field(default_factory=list)
    opportunities_found: int = 0
    new_opportunities: int = 0
    updated_opportunities: int = 0
    errors: List[str] = field(default_factory=list)
    status: str = "in_progress"


class OpportunityScanner:
    """
    Scans funding databases for opportunities.

    This component:
    - Connects to funding data sources
    - Runs periodic scans
    - Identifies new opportunities
    - Updates existing opportunity data

    INTEGRATION NOTES:
    ------------------
    For production deployment, integrate with:

    1. HORIZON EUROPE (CORDIS):
       - Use CORDIS API for call announcements
       - Parse work programme documents
       - Track topic deadlines

    2. NATIONAL AGENCIES:
       - DFG (Germany): RSS feeds
       - ANR (France): Open data portal
       - UKRI (UK): Gateway API

    3. FOUNDATIONS:
       - Scrape foundation websites
       - Monitor RSS/newsletter feeds
       - Parse PDF announcements

    4. CUSTOM SOURCES:
       - Support for institution-specific sources
       - Private funding networks
       - Industry partnership programs
    """

    def __init__(
        self,
        database_url: Optional[str] = None,
        embedding_client: Optional[Any] = None,
    ):
        """
        Initialize Opportunity Scanner.

        Args:
            database_url: Database for storing opportunities
            embedding_client: Client for semantic search embeddings
        """
        self.database_url = database_url
        self.embedding_client = embedding_client
        self.name = "OpportunityScanner"

        # Registered scan configurations
        self.configurations: Dict[str, ScanConfiguration] = {}

        logger.info(f"Initialized {self.name} (placeholder)")

    async def register_configuration(
        self,
        config: ScanConfiguration,
    ) -> None:
        """
        Register a scan configuration.

        Args:
            config: Scan configuration to register
        """
        self.configurations[config.config_id] = config
        logger.info(f"Registered scan configuration: {config.name}")

    async def run_scan(
        self,
        config_id: Optional[str] = None,
    ) -> ScanResult:
        """
        Run an opportunity scan.

        Args:
            config_id: Specific configuration to use (or all if None)

        Returns:
            Scan result with discovered opportunities

        TODO: Implement actual scanning logic
        """
        logger.info(f"Running opportunity scan (config: {config_id or 'all'})")

        # Placeholder response
        return ScanResult(
            scan_id=f"scan_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            config_id=config_id or "all",
            started_at=datetime.now(),
            completed_at=datetime.now(),
            sources_scanned=[],
            opportunities_found=0,
            new_opportunities=0,
            updated_opportunities=0,
            status="placeholder",
        )

    async def scan_cordis(
        self,
        keywords: List[str],
    ) -> List[Dict[str, Any]]:
        """
        Scan CORDIS for Horizon Europe opportunities.

        Args:
            keywords: Keywords to search for

        Returns:
            List of opportunities from CORDIS

        TODO: Implement CORDIS API integration
        """
        logger.info(f"Scanning CORDIS with keywords: {keywords}")

        # Placeholder - would use CORDIS API
        return []

    async def scan_erc(
        self,
        research_areas: List[str],
    ) -> List[Dict[str, Any]]:
        """
        Scan ERC for grant opportunities.

        Args:
            research_areas: Research areas to match

        Returns:
            List of ERC opportunities

        TODO: Implement ERC portal integration
        """
        logger.info(f"Scanning ERC for research areas: {research_areas}")

        # Placeholder - would scrape ERC portal
        return []

    async def semantic_search(
        self,
        query: str,
        sources: Optional[List[DataSource]] = None,
        top_k: int = 10,
    ) -> List[Dict[str, Any]]:
        """
        Semantic search for relevant opportunities.

        Uses vector similarity to find opportunities
        matching natural language descriptions.

        Args:
            query: Natural language query
            sources: Data sources to search
            top_k: Number of results to return

        Returns:
            List of matching opportunities with scores

        TODO: Implement embedding-based search
        """
        logger.info(f"Semantic search: {query[:50]}...")

        # Placeholder - would use embedding similarity
        return []

    async def get_scan_history(
        self,
        limit: int = 10,
    ) -> List[ScanResult]:
        """
        Get history of recent scans.

        Args:
            limit: Maximum number of results

        Returns:
            List of recent scan results

        TODO: Implement scan history retrieval
        """
        logger.info(f"Getting scan history (limit: {limit})")

        # Placeholder
        return []
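The SEMANTIC SEARCH strategy in the module docstring maps to a straightforward embed-and-rank loop. A self-contained sketch of what semantic_search() could do once an embedding client exists; embed() stands in for whatever embedding_client ends up providing and is an assumption, not an existing API:

# Hypothetical sketch (not in the commit) of the embedding-based ranking
# the semantic_search() TODO describes: embed the query and each opportunity
# description, rank by cosine similarity, return the top_k.
import math
from typing import Any, Callable, Dict, List

def rank_by_similarity(
    query: str,
    opportunities: List[Dict[str, Any]],  # each dict carries a "description"
    embed: Callable[[str], List[float]],  # assumed embedding function
    top_k: int = 10,
) -> List[Dict[str, Any]]:
    def cosine(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    q_vec = embed(query)
    scored = [
        {**opp, "match_score": cosine(q_vec, embed(opp["description"]))}
        for opp in opportunities
    ]
    scored.sort(key=lambda o: o["match_score"], reverse=True)
    return scored[:top_k]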
src/workflow/langgraph_state.py CHANGED
@@ -13,12 +13,29 @@ from langgraph.graph.message import add_messages
 
 class ScenarioType(str, Enum):
     """
-    VISTA scenario types.
-    Each scenario has a dedicated multi-agent workflow.
+    VISTA/Horizon EU scenario types for Technology Transfer Office (TTO) automation.
+    Each scenario has a dedicated multi-agent workflow aligned with TTO operations.
+
+    Coverage Status:
+    - FULLY COVERED (3): Patent Wake-Up, Agreement Safety, Partner Matching
+    - PARTIALLY COVERED (5): License Compliance, Award Identification, IP Portfolio, Due Diligence, Reporting
+    - NOT COVERED (2): Grant Writing, Negotiation Support
     """
+    # Fully Implemented Scenarios
     PATENT_WAKEUP = "patent_wakeup"        # Scenario 1: Dormant IP valorization
     AGREEMENT_SAFETY = "agreement_safety"  # Scenario 2: Legal agreement review
     PARTNER_MATCHING = "partner_matching"  # Scenario 5: Stakeholder matching
+
+    # New Scenarios (Placeholder - Partially Implemented)
+    LICENSE_COMPLIANCE = "license_compliance"      # Scenario 3: License tracking & compliance
+    AWARD_IDENTIFICATION = "award_identification"  # Scenario 4: Funding & award opportunities
+
+    # Future Scenarios (Not Yet Implemented)
+    IP_PORTFOLIO = "ip_portfolio"    # IP portfolio management
+    DUE_DILIGENCE = "due_diligence"  # Technology due diligence
+    REPORTING = "reporting"          # TTO metrics and reporting
+
+    # General Purpose
     GENERAL = "general"              # Custom/general purpose tasks
 
 
@@ -461,3 +478,217 @@ class ValorizationBrief(BaseModel):
     # Metadata
     generated_date: str = Field(..., description="Generation date")
     version: str = Field(default="1.0", description="Document version")
+
+
+# ============================================================================
+# License Compliance Monitoring Models (Scenario 3)
+# ============================================================================
+
+class ComplianceStatus(str, Enum):
+    """License compliance status for monitoring."""
+    COMPLIANT = "compliant"
+    NON_COMPLIANT = "non_compliant"
+    AT_RISK = "at_risk"
+    PENDING_REVIEW = "pending_review"
+    EXPIRED = "expired"
+
+
+class LicenseComplianceAnalysis(BaseModel):
+    """
+    License compliance analysis output from LicenseComplianceAgent.
+
+    GDPR Note: This model may contain references to personal data
+    (licensee contacts, payment info). Implement appropriate access
+    controls and data retention policies.
+    """
+    license_id: str = Field(..., description="License agreement identifier")
+    agreement_name: str = Field(..., description="Name of the agreement")
+    licensee: str = Field(..., description="Licensee organization name")
+
+    # Compliance status
+    overall_status: ComplianceStatus = Field(..., description="Overall compliance status")
+    compliance_score: float = Field(..., ge=0.0, le=1.0, description="Compliance score 0-1")
+
+    # Payment compliance
+    payments_current: bool = Field(..., description="All payments up to date")
+    payments_overdue: int = Field(default=0, description="Number of overdue payments")
+    total_outstanding: float = Field(default=0.0, description="Total outstanding amount")
+    currency: str = Field(default="EUR", description="Currency code")
+
+    # Milestone compliance
+    milestones_on_track: bool = Field(..., description="All milestones on track")
+    milestones_overdue: int = Field(default=0, description="Number of overdue milestones")
+    next_milestone_date: Optional[str] = Field(None, description="Next milestone due date")
+
+    # Alerts and issues
+    active_alerts: List[str] = Field(default_factory=list, description="Active compliance alerts")
+    issues_identified: List[str] = Field(default_factory=list, description="Identified issues")
+    recommendations: List[str] = Field(default_factory=list, description="Compliance recommendations")
+
+    # Confidence and validation
+    confidence_score: float = Field(..., ge=0.0, le=1.0, description="Analysis confidence")
+    human_review_required: bool = Field(default=False, description="Requires human review")
+    last_reviewed: Optional[str] = Field(None, description="Last human review date")
+
+
+class RevenueReport(BaseModel):
+    """Revenue report for license portfolio."""
+    report_id: str = Field(..., description="Report identifier")
+    period_start: str = Field(..., description="Reporting period start")
+    period_end: str = Field(..., description="Reporting period end")
+
+    # Revenue summary
+    total_revenue: float = Field(..., description="Total revenue in period")
+    currency: str = Field(default="EUR", description="Currency code")
+    by_license: Dict[str, float] = Field(default_factory=dict, description="Revenue by license")
+    by_type: Dict[str, float] = Field(default_factory=dict, description="Revenue by type")
+
+    # Comparisons
+    vs_previous_period: Optional[float] = Field(None, description="% change vs previous period")
+    vs_forecast: Optional[float] = Field(None, description="% vs forecast")
+
+    # Analysis quality
+    confidence_score: float = Field(..., ge=0.0, le=1.0, description="Report confidence")
+
+
+# ============================================================================
+# Award Identification Models (Scenario 4)
+# ============================================================================
+
+class FundingOpportunity(BaseModel):
+    """
+    Funding opportunity identified by the award scanning system.
+
+    Represents grants, awards, and other funding opportunities
+    matched to research capabilities.
+    """
+    opportunity_id: str = Field(..., description="Opportunity identifier")
+    title: str = Field(..., description="Opportunity title")
+    description: str = Field(..., description="Full description")
+
+    # Funder information
+    funder: str = Field(..., description="Funding organization name")
+    funder_type: str = Field(..., description="Type: government, EU, foundation, corporate")
+    program_name: Optional[str] = Field(None, description="Funding program name")
+
+    # Funding details
+    amount_min: Optional[float] = Field(None, description="Minimum funding amount")
+    amount_max: Optional[float] = Field(None, description="Maximum funding amount")
+    currency: str = Field(default="EUR", description="Currency code")
+    funding_type: str = Field(..., description="Type: grant, award, prize, fellowship")
+
+    # Timing
+    deadline: Optional[str] = Field(None, description="Application deadline")
+    duration_months: Optional[int] = Field(None, description="Funding duration in months")
+    decision_date: Optional[str] = Field(None, description="Expected decision date")
+
+    # Matching
+    match_score: float = Field(..., ge=0.0, le=1.0, description="Match score with capabilities")
+    match_rationale: str = Field(..., description="Why this is a good match")
+    eligibility_status: str = Field(..., description="eligible, ineligible, partial, unknown")
+    eligibility_notes: List[str] = Field(default_factory=list, description="Eligibility details")
+
+    # Next steps
+    recommended_action: str = Field(..., description="Recommended next step")
+    application_effort: str = Field(..., description="Low, Medium, High effort required")
+    success_likelihood: str = Field(..., description="Low, Medium, High likelihood")
+
+    # Metadata
+    url: Optional[str] = Field(None, description="Opportunity URL")
+    keywords: List[str] = Field(default_factory=list, description="Relevant keywords")
+    research_areas: List[str] = Field(default_factory=list, description="Matching research areas")
+    discovered_date: str = Field(..., description="When opportunity was discovered")
+
+    # Quality
+    confidence_score: float = Field(..., ge=0.0, le=1.0, description="Analysis confidence")
+
+
+class AwardApplicationStatus(BaseModel):
+    """Status tracking for award/grant applications."""
+    application_id: str = Field(..., description="Application identifier")
+    opportunity_id: str = Field(..., description="Target opportunity")
+
+    # Status
+    status: str = Field(..., description="draft, internal_review, submitted, under_review, awarded, rejected")
+    submitted_date: Optional[str] = Field(None, description="Submission date")
+    decision_date: Optional[str] = Field(None, description="Decision received date")
+
+    # Documents
+    documents_completed: int = Field(default=0, description="Completed documents")
+    documents_required: int = Field(default=0, description="Total required documents")
+    documents_pending_review: int = Field(default=0, description="Documents pending review")
+
+    # Quality
+    overall_score: Optional[float] = Field(None, ge=0.0, le=1.0, description="Application quality score")
+    critic_validation: Optional[Dict[str, Any]] = Field(None, description="CriticAgent validation result")
+    human_approved: bool = Field(default=False, description="Human approval received")
+
+    # Notes
+    internal_notes: List[str] = Field(default_factory=list, description="Internal notes")
+    feedback: Optional[str] = Field(None, description="Feedback from funder if received")
+
+
+# ============================================================================
+# Human-in-the-Loop Decision Models
+# ============================================================================
+
+class HumanDecisionPoint(BaseModel):
+    """
+    Human-in-the-loop decision point for workflow orchestration.
+
+    Captures when and why human input is required, and tracks
+    the decision made.
+    """
+    decision_id: str = Field(..., description="Decision point identifier")
+    workflow_id: str = Field(..., description="Parent workflow ID")
+    scenario: ScenarioType = Field(..., description="Scenario requiring decision")
+
+    # Decision context
+    decision_type: str = Field(..., description="Type: approval, selection, verification, override")
+    question: str = Field(..., description="Decision question for human")
+    context: str = Field(..., description="Context and background for decision")
+    options: List[str] = Field(default_factory=list, description="Available options")
+
+    # AI recommendation
+    ai_recommendation: Optional[str] = Field(None, description="AI recommended option")
+    ai_confidence: Optional[float] = Field(None, ge=0.0, le=1.0, description="AI confidence in recommendation")
+    ai_rationale: Optional[str] = Field(None, description="Rationale for AI recommendation")
+
+    # Human decision
+    human_decision: Optional[str] = Field(None, description="Human selected option")
+    human_rationale: Optional[str] = Field(None, description="Human provided rationale")
+    decided_by: Optional[str] = Field(None, description="User who made decision")
+    decided_at: Optional[str] = Field(None, description="Timestamp of decision")
+
+    # Status
+    status: str = Field(default="pending", description="pending, decided, expired, skipped")
+    expires_at: Optional[str] = Field(None, description="When decision times out")
+
+    # Audit
+    created_at: str = Field(..., description="When decision point was created")
+
+
+class SourceVerification(BaseModel):
+    """
+    Source verification for hallucination mitigation.
+
+    Tracks sources used by AI agents and their verification status.
+    """
+    verification_id: str = Field(..., description="Verification identifier")
+    claim: str = Field(..., description="AI-generated claim to verify")
+
+    # Sources
+    sources: List[Dict[str, Any]] = Field(default_factory=list, description="Supporting sources")
+    source_count: int = Field(default=0, description="Number of sources found")
+
+    # Verification
+    verified: bool = Field(..., description="Claim is verified by sources")
+    verification_score: float = Field(..., ge=0.0, le=1.0, description="Verification confidence")
+    verification_method: str = Field(..., description="How verification was performed")
+
+    # Issues
+    discrepancies: List[str] = Field(default_factory=list, description="Discrepancies found")
+    warnings: List[str] = Field(default_factory=list, description="Verification warnings")
+
+    # Metadata
+    verified_at: str = Field(..., description="When verification was performed")