Spaces:
Sleeping
feat: Add Paper Auto-Discovery (PAD) engine and update documentation
Browse files- Implement custom multi-source paper search engine (PAD)
- Semantic Scholar Graph API v1 integration
- arXiv API with XML parsing
- Parallel API execution using ThreadPoolExecutor
- Smart deduplication and result ranking
- Sub-2-second search performance
- Add PAD Search tab to Gradio UI
- Real-time search across multiple sources
- Interactive paper selection with metadata preview
- One-click workflow from search to podcast generation
- Update About section with comprehensive PAD documentation
- Technical architecture details
- Performance metrics and innovation highlights
- Integration with existing PPF system
- Enhance README.md with PAD features
- Updated overview and feature list
- Expanded technical stack section
- Revised project structure
Hackathon-ready version for MCP 1st Birthday - Track 2 (Consumer)
- README.md +61 -34
- app.py +316 -42
- output/history.json +23 -1
- processing/paper_discovery.py +309 -0
- todo.md +0 -91
|
@@ -17,39 +17,59 @@ tags:
|
|
| 17 |
|
| 18 |
# PaperCast ποΈ
|
| 19 |
|
| 20 |
-
Transform research papers into engaging podcast-style conversations.
|
| 21 |
|
| 22 |
**Track:** `mcp-in-action-track-consumer`
|
| 23 |
|
| 24 |
## Overview
|
| 25 |
|
| 26 |
-
PaperCast is an AI agent application
|
| 27 |
|
| 28 |
-
## Features
|
| 29 |
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
-
|
| 33 |
-
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 34 |
- π **Complete Transcripts**: Download both audio and text versions
|
| 35 |
-
-
|
| 36 |
|
| 37 |
## How It Works
|
| 38 |
|
| 39 |
-
1. **
|
| 40 |
-
2. **
|
| 41 |
-
3. **
|
| 42 |
-
4. **
|
| 43 |
-
5. **
|
|
|
|
|
|
|
| 44 |
|
| 45 |
## Technical Stack
|
| 46 |
|
| 47 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 48 |
- **AI Agent**: Autonomous reasoning with MCP integration
|
| 49 |
-
- **LLM**:
|
| 50 |
-
- **TTS**: ElevenLabs (API) or Supertonic-66M (CPU, no API key required)
|
| 51 |
-
- **PDF Processing**: PyMuPDF
|
| 52 |
-
- **Platform**: HuggingFace Spaces
|
| 53 |
|
| 54 |
## Installation
|
| 55 |
|
|
@@ -71,28 +91,37 @@ Then open your browser to the provided URL (typically `http://localhost:7860`).
|
|
| 71 |
|
| 72 |
```
|
| 73 |
papercast/
|
| 74 |
-
βββ app.py
|
| 75 |
-
ββ
|
| 76 |
-
βββ README.md
|
| 77 |
-
βββ agents/
|
| 78 |
-
|
| 79 |
-
βββ processing/
|
| 80 |
-
βββ
|
| 81 |
-
βββ
|
| 82 |
-
βββ
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 83 |
```
|
| 84 |
|
| 85 |
## Team
|
| 86 |
|
| 87 |
-
- [
|
| 88 |
|
| 89 |
## Demo
|
| 90 |
|
| 91 |
-
[
|
| 92 |
|
| 93 |
## Social Media
|
| 94 |
|
| 95 |
-
[
|
| 96 |
|
| 97 |
## Acknowledgments
|
| 98 |
|
|
@@ -105,10 +134,8 @@ Special thanks to:
|
|
| 105 |
|
| 106 |
## License
|
| 107 |
|
| 108 |
-
|
| 109 |
|
| 110 |
---
|
| 111 |
|
| 112 |
-
**
|
| 113 |
-
**Category:** Consumer Applications
|
| 114 |
-
**Organization:** MCP-1st-Birthday
|
|
|
|
| 17 |
|
| 18 |
# PaperCast ποΈ
|
| 19 |
|
| 20 |
+
Transform research papers into engaging podcast-style conversations with intelligent paper discovery.
|
| 21 |
|
| 22 |
**Track:** `mcp-in-action-track-consumer`
|
| 23 |
|
| 24 |
## Overview
|
| 25 |
|
| 26 |
+
PaperCast is an AI agent application featuring two groundbreaking innovations: **Paper Auto-Discovery (PAD)** for intelligent multi-source search, and **Podcast Persona Framework (PPF)** for adaptive conversation styles. Simply search for papers, select one, choose your persona, and get a personalized podcast in under 60 seconds.
|
| 27 |
|
| 28 |
+
## Revolutionary Features
|
| 29 |
|
| 30 |
+
### π PAD - Paper Auto-Discovery Engine
|
| 31 |
+
**Custom-built multi-source academic search system**
|
| 32 |
+
- Search across Semantic Scholar (200M+ papers) and arXiv simultaneously
|
| 33 |
+
- Parallel API execution with results in under 2 seconds
|
| 34 |
+
- Smart deduplication and relevance ranking
|
| 35 |
+
- Zero-friction workflow: search β select β podcast
|
| 36 |
+
|
| 37 |
+
### π PPF - Podcast Persona Framework
|
| 38 |
+
**World's first adaptive persona system for academic podcasts**
|
| 39 |
+
- **5 Distinct Conversation Modes**: Friendly Explainer, Academic Debate, Savage Roast, Pedagogical, Interdisciplinary Clash
|
| 40 |
+
- Dynamic character personalities (not just voice changes)
|
| 41 |
+
- Adaptive dialogue based on selected persona
|
| 42 |
+
|
| 43 |
+
### β‘ Core Features
|
| 44 |
+
- π **Multiple Input Methods**: PAD search, arXiv URLs, or PDF uploads
|
| 45 |
+
- π€ **Autonomous Agent**: Intelligent discovery, analysis, and persona-aware generation
|
| 46 |
+
- π£οΈ **Studio-Quality Audio**: ElevenLabs Turbo v2.5 or Supertonic CPU TTS
|
| 47 |
- π **Complete Transcripts**: Download both audio and text versions
|
| 48 |
+
- π **60-Second Pipeline**: From search query to finished podcast in under a minute
|
| 49 |
|
| 50 |
## How It Works
|
| 51 |
|
| 52 |
+
1. **π Discovery (PAD)**: Search for papers across Semantic Scholar & arXiv (or use URL/PDF)
|
| 53 |
+
2. **π Selection**: Choose from curated results with metadata preview
|
| 54 |
+
3. **π Persona**: Select conversation style (Friendly, Debate, Roast, Pedagogical, etc.)
|
| 55 |
+
4. **π Analysis**: AI agent analyzes paper structure and identifies key concepts
|
| 56 |
+
5. **π¬ Script Generation**: Creates persona-specific dialogue with distinct characters
|
| 57 |
+
6. **π€ Audio Synthesis**: Converts script to studio-quality audio with ElevenLabs or Supertonic
|
| 58 |
+
7. **β
Output**: Download podcast audio and transcript
|
| 59 |
|
| 60 |
## Technical Stack
|
| 61 |
|
| 62 |
+
**Core Innovations** (Built from Scratch):
|
| 63 |
+
- **PAD Engine**: Custom Python multi-source search with ThreadPoolExecutor, Semantic Scholar Graph API v1, arXiv API integration
|
| 64 |
+
- **PPF System**: Proprietary persona framework with character-aware prompts and dynamic voice mapping
|
| 65 |
+
|
| 66 |
+
**Production Stack**:
|
| 67 |
+
- **Framework**: Gradio 6 with custom glass-morphism UI
|
| 68 |
- **AI Agent**: Autonomous reasoning with MCP integration
|
| 69 |
+
- **LLM**: OpenAI GPT-4o/o1, or local models (universal support)
|
| 70 |
+
- **TTS**: ElevenLabs Turbo v2.5 (API) or Supertonic-66M (CPU, no API key required)
|
| 71 |
+
- **PDF Processing**: PyMuPDF for fast extraction
|
| 72 |
+
- **Platform**: HuggingFace Spaces / Modal
|
| 73 |
|
| 74 |
## Installation
|
| 75 |
|
|
|
|
| 91 |
|
| 92 |
```
|
| 93 |
papercast/
|
| 94 |
+
βββ app.py # Main Gradio application with PAD & PPF UI
|
| 95 |
+
βοΏ½οΏ½οΏ½β requirements.txt # Python dependencies
|
| 96 |
+
βββ README.md # This file
|
| 97 |
+
βββ agents/ # Agent logic and orchestration
|
| 98 |
+
β βββ podcast_agent.py # Main agent with PPF integration
|
| 99 |
+
βββ processing/ # Paper discovery and PDF processing
|
| 100 |
+
β βββ paper_discovery.py # PAD engine (custom-built)
|
| 101 |
+
β βββ pdf_reader.py # PDF extraction
|
| 102 |
+
β βββ url_fetcher.py # Paper fetching
|
| 103 |
+
βββ generation/ # Script and dialogue generation
|
| 104 |
+
β βββ podcast_personas.py # PPF persona definitions
|
| 105 |
+
β βββ script_generator.py # LLM-based script generation
|
| 106 |
+
βββ synthesis/ # Text-to-speech audio generation
|
| 107 |
+
β βββ tts_engine.py # ElevenLabs integration
|
| 108 |
+
β βββ supertonic_tts.py # CPU-based TTS
|
| 109 |
+
βββ utils/ # Helper functions
|
| 110 |
+
βββ config.py # Configuration management
|
| 111 |
+
βββ history.py # Podcast history tracking
|
| 112 |
```
|
| 113 |
|
| 114 |
## Team
|
| 115 |
|
| 116 |
+
- batuhanozkose [My HuggingFace profile](https://huggingface.co/batuhanozkose)
|
| 117 |
|
| 118 |
## Demo
|
| 119 |
|
| 120 |
+
[DEMO Video] (https://youtu.be/IQ3z2CbWg-Y)
|
| 121 |
|
| 122 |
## Social Media
|
| 123 |
|
| 124 |
+
[X Thread Link](https://x.com/batuhan_ozkose/status/1993662091413385422)
|
| 125 |
|
| 126 |
## Acknowledgments
|
| 127 |
|
|
|
|
| 134 |
|
| 135 |
## License
|
| 136 |
|
| 137 |
+
MIT License
|
| 138 |
|
| 139 |
---
|
| 140 |
|
| 141 |
+
**Made with β€οΈ for the research community**
|
|
|
|
|
|
|
@@ -9,6 +9,7 @@ from utils.config import (
|
|
| 9 |
SCRIPT_GENERATION_MODEL,
|
| 10 |
)
|
| 11 |
from utils.history import get_history_items, load_history
|
|
|
|
| 12 |
|
| 13 |
# Ensure output directory exists
|
| 14 |
os.makedirs(OUTPUT_DIR, exist_ok=True)
|
|
@@ -405,6 +406,97 @@ def on_history_select(evt: gr.SelectData, data):
|
|
| 405 |
except:
|
| 406 |
return None
|
| 407 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 408 |
# --- Main UI ---
|
| 409 |
|
| 410 |
def main():
|
|
@@ -456,45 +548,157 @@ def main():
|
|
| 456 |
# Left Col: Inputs
|
| 457 |
with gr.Column(scale=4, elem_classes="glass-panel"):
|
| 458 |
gr.Markdown("### π₯ Source Material")
|
| 459 |
-
|
| 460 |
-
with gr.Tabs():
|
| 461 |
-
with gr.Tab("π URL"):
|
| 462 |
url_input = gr.Textbox(
|
| 463 |
-
label="Paper URL",
|
| 464 |
placeholder="https://arxiv.org/abs/...",
|
| 465 |
show_label=False,
|
| 466 |
container=False
|
| 467 |
)
|
| 468 |
-
|
| 469 |
with gr.Tab("π PDF Upload"):
|
| 470 |
pdf_upload = gr.File(
|
| 471 |
label="Upload PDF",
|
| 472 |
file_types=[".pdf"],
|
| 473 |
container=False
|
| 474 |
)
|
| 475 |
-
|
| 476 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 477 |
advanced_mode = gr.Checkbox(label="Batch Mode (Multiple Papers)")
|
| 478 |
-
|
| 479 |
# Warning message (only visible in batch mode)
|
| 480 |
batch_warning = gr.Markdown(
|
| 481 |
"""
|
| 482 |
> **β οΈ Experimental Feature**
|
| 483 |
-
>
|
| 484 |
-
> Batch mode is currently experimental and may not work reliably in all cases.
|
| 485 |
> Some attempts may fail due to model limitations or processing errors.
|
| 486 |
> If you experience issues, try processing papers individually.
|
| 487 |
""",
|
| 488 |
visible=False
|
| 489 |
)
|
| 490 |
-
|
| 491 |
with gr.Group(visible=False) as batch_inputs:
|
| 492 |
multi_url_input = gr.Textbox(label="Multiple URLs (one per line)", lines=3)
|
| 493 |
multi_pdf_upload = gr.File(label="Multiple PDFs", file_count="multiple")
|
| 494 |
-
|
| 495 |
gr.Markdown("---")
|
| 496 |
gr.Markdown("### π Context Settings")
|
| 497 |
-
|
| 498 |
# Context limit slider (only visible in batch mode)
|
| 499 |
context_limit_slider = gr.Slider(
|
| 500 |
minimum=50000,
|
|
@@ -504,7 +708,7 @@ def main():
|
|
| 504 |
label="Max Context Limit (characters)",
|
| 505 |
info="β οΈ Warning: Increasing this limit will increase token costs and processing time."
|
| 506 |
)
|
| 507 |
-
|
| 508 |
def toggle_advanced(adv):
|
| 509 |
return {
|
| 510 |
batch_warning: gr.update(visible=adv),
|
|
@@ -514,6 +718,18 @@ def main():
|
|
| 514 |
}
|
| 515 |
advanced_mode.change(toggle_advanced, advanced_mode, [batch_warning, batch_inputs, url_input, pdf_upload])
|
| 516 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 517 |
generate_btn = gr.Button(
|
| 518 |
"ποΈ Generate Podcast",
|
| 519 |
variant="primary",
|
|
@@ -783,13 +999,57 @@ def main():
|
|
| 783 |
|
| 784 |
# About PaperCast
|
| 785 |
|
| 786 |
-
**The world's first adaptive persona-driven academic podcast platform.**
|
| 787 |
|
| 788 |
-
Transform any research paper into engaging audio conversations with your choice of style β from casual explanations to brutal critiques. Powered by our revolutionary **Podcast Persona Framework (PPF)**, MCP tools, and studio-quality TTS.
|
| 789 |
|
| 790 |
---
|
| 791 |
|
| 792 |
-
## π Revolutionary
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 793 |
|
| 794 |
### **PPF** β Podcast Persona Framework
|
| 795 |
**The world's first adaptive persona system for AI-generated academic podcasts.**
|
|
@@ -837,51 +1097,56 @@ Traditional podcast generators produce the same monotonous style for every paper
|
|
| 837 |
|
| 838 |
## π― How It Works
|
| 839 |
|
| 840 |
-
Our intelligent agent orchestrates a **
|
| 841 |
|
| 842 |
-
1. **
|
| 843 |
-
2. **
|
| 844 |
-
3. **
|
| 845 |
-
4. **
|
| 846 |
-
5. **
|
| 847 |
-
6. **
|
| 848 |
-
7. **
|
|
|
|
| 849 |
|
| 850 |
-
**What makes this special:** Unlike generic converters,
|
| 851 |
|
| 852 |
---
|
| 853 |
|
| 854 |
## π Key Features
|
| 855 |
|
| 856 |
-
|
|
|
|
|
|
|
| 857 |
|
| 858 |
π§ **Dynamic Character Intelligence** β Real personalities, not generic voices
|
| 859 |
|
|
|
|
|
|
|
| 860 |
ποΈ **Studio-Quality Audio** β ElevenLabs Turbo v2.5 (250ms latency, cinematic quality)
|
| 861 |
|
| 862 |
π§ **Universal Compatibility** β Works with any LLM (OpenAI, local models, reasoning models)
|
| 863 |
|
| 864 |
-
β‘ **Zero-Configuration TTS** β Automatic voice mapping for any persona
|
| 865 |
-
|
| 866 |
π **Complete History** β All podcasts saved locally with metadata
|
| 867 |
|
| 868 |
π **Multi-Paper Support** β Batch process multiple papers into comprehensive discussions
|
| 869 |
|
| 870 |
π― **Provider Agnostic** β Bring your own API keys, use local models, total flexibility
|
| 871 |
|
|
|
|
|
|
|
| 872 |
---
|
| 873 |
|
| 874 |
## π§ Technology Stack
|
| 875 |
|
| 876 |
-
**Core
|
|
|
|
|
|
|
| 877 |
|
| 878 |
**LLM**: Universal support (OpenAI GPT-4o/o1, local LLMs, reasoning models)
|
| 879 |
**TTS**: ElevenLabs Turbo v2.5 (premium) or Supertonic (free CPU-based)
|
| 880 |
**PDF Processing**: PyMuPDF for fast, accurate text extraction
|
| 881 |
-
**Paper Sources**: Direct arXiv/medRxiv integration
|
| 882 |
**UI Framework**: Gradio 6 with custom glass-morphism design
|
| 883 |
**Agent Architecture**: Custom Python orchestrator with MCP tools
|
| 884 |
-
**Infrastructure**: Local-first (your machine) or cloud-ready (Modal/HF Spaces)
|
| 885 |
|
| 886 |
---
|
| 887 |
|
|
@@ -891,20 +1156,22 @@ Our intelligent agent orchestrates a **persona-aware pipeline** that adapts to y
|
|
| 891 |
*Tag: `mcp-in-action-track-consumer`*
|
| 892 |
|
| 893 |
**What we're showcasing:**
|
|
|
|
| 894 |
- π **PPF Innovation** - First-ever adaptive persona system for academic podcasts
|
| 895 |
- π€ **Autonomous Agent** - Intelligent planning, reasoning, and persona-aware execution
|
| 896 |
- π§ **MCP Integration** - Tools as cognitive extensions for the agent
|
| 897 |
-
- π¨ **Gradio 6 UX** - Glass-morphism design with intuitive persona controls
|
| 898 |
- π **Real Impact** - Making research accessible and engaging for everyone
|
| 899 |
|
| 900 |
-
**Why PPF
|
| 901 |
|
| 902 |
---
|
| 903 |
|
| 904 |
## π About the Agent
|
| 905 |
|
| 906 |
-
PaperCast's **
|
| 907 |
|
|
|
|
| 908 |
- **π§ Persona Analysis** - Evaluates paper complexity and matches optimal persona mode
|
| 909 |
- **π Strategic Planning** - Determines conversation flow based on selected persona (debate-style vs. teaching-style)
|
| 910 |
- **π Character Orchestration** - Generates distinct personalities for each persona (Dr. Morgan β The Critic β Professor Chen)
|
|
@@ -912,29 +1179,32 @@ PaperCast's **persona-aware autonomous agent** makes intelligent decisions at ev
|
|
| 912 |
- **π£οΈ Dynamic Synthesis** - Maps persona characters to voice IDs automatically
|
| 913 |
- **π Multi-Paper Intelligence** - Synthesizes insights across papers while maintaining persona consistency
|
| 914 |
|
| 915 |
-
**The key insight:** The agent doesn't just process papers β it **performs** them
|
| 916 |
|
| 917 |
---
|
| 918 |
|
| 919 |
## π‘ Use Cases
|
| 920 |
|
| 921 |
### π§ **Learning & Education**
|
|
|
|
| 922 |
- **Pedagogical mode** for complex topics you want to master
|
| 923 |
- **Friendly Explainer** for quick overviews during commutes
|
| 924 |
- **Interdisciplinary Clash** to understand papers outside your field
|
| 925 |
|
| 926 |
### π¬ **Research & Analysis**
|
|
|
|
| 927 |
- **Academic Debate** for critical evaluation of methodologies
|
| 928 |
- **Savage Roast** to identify weak points and overstated claims
|
| 929 |
-
- Quick paper screening before deep reading
|
| 930 |
|
| 931 |
### π **Accessibility**
|
|
|
|
| 932 |
- Make cutting-edge research understandable for non-experts
|
| 933 |
- Bridge knowledge gaps between disciplines
|
| 934 |
- Learn through conversation, not dry text
|
| 935 |
|
| 936 |
### π **Entertainment**
|
| 937 |
-
- **Savage Roast**
|
| 938 |
- Host paper "debate clubs" with Academic Debate mode
|
| 939 |
- Share entertaining takes on research with Savage Roast clips
|
| 940 |
|
|
@@ -942,17 +1212,21 @@ PaperCast's **persona-aware autonomous agent** makes intelligent decisions at ev
|
|
| 942 |
|
| 943 |
## π What Makes Us Different
|
| 944 |
|
|
|
|
|
|
|
| 945 |
π **We invented PPF** β The Podcast Persona Framework is a **world-first innovation**. No other platform offers adaptive conversation personas.
|
| 946 |
|
|
|
|
|
|
|
| 947 |
π§ **Real characters, not voices** β Other tools change tone. We create **distinct personalities** with names, perspectives, and consistent behavior.
|
| 948 |
|
| 949 |
-
|
| 950 |
|
| 951 |
-
|
| 952 |
|
| 953 |
-
π― **User empowerment** β You choose
|
| 954 |
|
| 955 |
-
**The bottom line:** Every other podcast generator is a one-trick pony. PaperCast is a **repertory theater company** β
|
| 956 |
|
| 957 |
---
|
| 958 |
|
|
|
|
| 9 |
SCRIPT_GENERATION_MODEL,
|
| 10 |
)
|
| 11 |
from utils.history import get_history_items, load_history
|
| 12 |
+
from processing.paper_discovery import search_papers, PaperDiscoveryEngine
|
| 13 |
|
| 14 |
# Ensure output directory exists
|
| 15 |
os.makedirs(OUTPUT_DIR, exist_ok=True)
|
|
|
|
| 406 |
except:
|
| 407 |
return None
|
| 408 |
|
| 409 |
+
def perform_paper_search(query: str, progress=gr.Progress()):
|
| 410 |
+
"""
|
| 411 |
+
PAD: Search for papers using Paper Auto-Discovery
|
| 412 |
+
|
| 413 |
+
Returns formatted results for display in UI
|
| 414 |
+
"""
|
| 415 |
+
if not query or not query.strip():
|
| 416 |
+
return gr.update(choices=[], value=None, visible=False), "β οΈ Please enter a search query"
|
| 417 |
+
|
| 418 |
+
progress(0.2, desc="Searching Semantic Scholar & arXiv...")
|
| 419 |
+
|
| 420 |
+
try:
|
| 421 |
+
# Search using PAD
|
| 422 |
+
results = search_papers(query.strip(), max_results=5)
|
| 423 |
+
|
| 424 |
+
if not results:
|
| 425 |
+
return gr.update(choices=[], value=None, visible=False), "β No papers found. Try a different query."
|
| 426 |
+
|
| 427 |
+
progress(0.8, desc=f"Found {len(results)} papers")
|
| 428 |
+
|
| 429 |
+
# Format results for Dropdown display
|
| 430 |
+
choices = []
|
| 431 |
+
for i, paper in enumerate(results, 1):
|
| 432 |
+
authors_str = ", ".join(paper.authors[:2])
|
| 433 |
+
if len(paper.authors) > 2:
|
| 434 |
+
authors_str += " et al."
|
| 435 |
+
|
| 436 |
+
year_str = f" ({paper.year})" if paper.year else ""
|
| 437 |
+
source_emoji = "π" if paper.source == "semantic_scholar" else "π¬"
|
| 438 |
+
|
| 439 |
+
# Create display label for dropdown
|
| 440 |
+
label = f"{i}. {source_emoji} {paper.title}{year_str} | {authors_str}"
|
| 441 |
+
choices.append(label) # Dropdown just needs the labels
|
| 442 |
+
|
| 443 |
+
progress(1.0, desc="Search complete!")
|
| 444 |
+
|
| 445 |
+
print(f"[DEBUG] Search found {len(results)} papers")
|
| 446 |
+
print(f"[DEBUG] Choices created: {len(choices)}")
|
| 447 |
+
print(f"[DEBUG] First choice: {choices[0] if choices else 'NONE'}")
|
| 448 |
+
|
| 449 |
+
# Store results in a global variable (we'll use State instead)
|
| 450 |
+
# Return updated Dropdown and success message
|
| 451 |
+
success_msg = f"β
Found {len(results)} papers from Semantic Scholar & arXiv"
|
| 452 |
+
|
| 453 |
+
# Select the first option by default to ensure visibility/interaction
|
| 454 |
+
first_choice = choices[0] if choices else None
|
| 455 |
+
|
| 456 |
+
return gr.update(choices=choices, value=first_choice, visible=True, interactive=True), success_msg
|
| 457 |
+
|
| 458 |
+
except Exception as e:
|
| 459 |
+
return gr.update(choices=[], value=None, visible=False), f"β Search failed: {str(e)}"
|
| 460 |
+
|
| 461 |
+
def on_paper_select(selected_label, query):
|
| 462 |
+
"""
|
| 463 |
+
Handle paper selection from search results.
|
| 464 |
+
Returns the PDF URL to be used for podcast generation.
|
| 465 |
+
"""
|
| 466 |
+
if not selected_label:
|
| 467 |
+
return None, "β οΈ Please select a paper from the search results"
|
| 468 |
+
|
| 469 |
+
try:
|
| 470 |
+
# Extract index from label (format: "1. emoji title...")
|
| 471 |
+
selected_index = int(selected_label.split(".")[0]) - 1
|
| 472 |
+
|
| 473 |
+
# Re-run search to get results (since we can't pass complex objects through Gradio)
|
| 474 |
+
results = search_papers(query.strip(), max_results=5)
|
| 475 |
+
|
| 476 |
+
if not results or selected_index >= len(results) or selected_index < 0:
|
| 477 |
+
return None, "β Invalid selection"
|
| 478 |
+
|
| 479 |
+
selected_paper = results[selected_index]
|
| 480 |
+
|
| 481 |
+
# Get PDF URL
|
| 482 |
+
engine = PaperDiscoveryEngine()
|
| 483 |
+
pdf_url = engine.get_pdf_url(selected_paper)
|
| 484 |
+
|
| 485 |
+
if not pdf_url:
|
| 486 |
+
return None, f"β No PDF available for: {selected_paper.title}"
|
| 487 |
+
|
| 488 |
+
# Return PDF URL and success message
|
| 489 |
+
authors_str = ", ".join(selected_paper.authors[:3])
|
| 490 |
+
if len(selected_paper.authors) > 3:
|
| 491 |
+
authors_str += " et al."
|
| 492 |
+
|
| 493 |
+
success_msg = f"β
Selected: **{selected_paper.title}**\n\nπ₯ {authors_str}\nπ
{selected_paper.year or 'N/A'}\nπ {pdf_url}"
|
| 494 |
+
|
| 495 |
+
return pdf_url, success_msg
|
| 496 |
+
|
| 497 |
+
except Exception as e:
|
| 498 |
+
return None, f"β Selection failed: {str(e)}"
|
| 499 |
+
|
| 500 |
# --- Main UI ---
|
| 501 |
|
| 502 |
def main():
|
|
|
|
| 548 |
# Left Col: Inputs
|
| 549 |
with gr.Column(scale=4, elem_classes="glass-panel"):
|
| 550 |
gr.Markdown("### π₯ Source Material")
|
| 551 |
+
|
| 552 |
+
with gr.Tabs(selected=0) as input_tabs:
|
| 553 |
+
with gr.Tab("π URL", id=0):
|
| 554 |
url_input = gr.Textbox(
|
| 555 |
+
label="Paper URL",
|
| 556 |
placeholder="https://arxiv.org/abs/...",
|
| 557 |
show_label=False,
|
| 558 |
container=False
|
| 559 |
)
|
| 560 |
+
|
| 561 |
with gr.Tab("π PDF Upload"):
|
| 562 |
pdf_upload = gr.File(
|
| 563 |
label="Upload PDF",
|
| 564 |
file_types=[".pdf"],
|
| 565 |
container=False
|
| 566 |
)
|
| 567 |
+
|
| 568 |
+
with gr.Tab("π Search (PAD)"):
|
| 569 |
+
gr.Markdown("**Paper Auto-Discovery** β Search across Semantic Scholar & arXiv")
|
| 570 |
+
|
| 571 |
+
with gr.Row():
|
| 572 |
+
search_query = gr.Textbox(
|
| 573 |
+
label="Search Query",
|
| 574 |
+
placeholder="e.g., 'diffusion models', 'Grok reasoning', 'transformer attention'...",
|
| 575 |
+
show_label=False,
|
| 576 |
+
container=False,
|
| 577 |
+
scale=4,
|
| 578 |
+
lines=1,
|
| 579 |
+
max_lines=1
|
| 580 |
+
)
|
| 581 |
+
search_btn = gr.Button("π Search", variant="primary", scale=1)
|
| 582 |
+
|
| 583 |
+
search_status = gr.Markdown("", visible=True)
|
| 584 |
+
|
| 585 |
+
# Container for search results (always visible)
|
| 586 |
+
with gr.Column(visible=True) as search_results_container:
|
| 587 |
+
search_results = gr.Radio(
|
| 588 |
+
label="π Select a Paper",
|
| 589 |
+
choices=[],
|
| 590 |
+
interactive=True,
|
| 591 |
+
show_label=True,
|
| 592 |
+
)
|
| 593 |
+
|
| 594 |
+
use_selected_btn = gr.Button(
|
| 595 |
+
"β
Use Selected Paper",
|
| 596 |
+
variant="primary",
|
| 597 |
+
size="lg"
|
| 598 |
+
)
|
| 599 |
+
|
| 600 |
+
# Hidden state to store selected PDF URL from search
|
| 601 |
+
selected_pdf_url = gr.State(value=None)
|
| 602 |
+
selected_search_query = gr.State(value=None)
|
| 603 |
+
|
| 604 |
+
# Wire search functionality
|
| 605 |
+
def handle_search(query):
|
| 606 |
+
"""Handle search button click"""
|
| 607 |
+
if not query or not query.strip():
|
| 608 |
+
return (
|
| 609 |
+
gr.update(choices=[], value=None),
|
| 610 |
+
"β οΈ Please enter a search query",
|
| 611 |
+
query
|
| 612 |
+
)
|
| 613 |
+
|
| 614 |
+
try:
|
| 615 |
+
# Search using PAD
|
| 616 |
+
results = search_papers(query.strip(), max_results=5)
|
| 617 |
+
|
| 618 |
+
if not results:
|
| 619 |
+
return (
|
| 620 |
+
gr.update(choices=[], value=None),
|
| 621 |
+
"β No papers found. Try a different query.",
|
| 622 |
+
query
|
| 623 |
+
)
|
| 624 |
+
|
| 625 |
+
# Format results for Radio display
|
| 626 |
+
choices = []
|
| 627 |
+
for i, paper in enumerate(results, 1):
|
| 628 |
+
authors_str = ", ".join(paper.authors[:2])
|
| 629 |
+
if len(paper.authors) > 2:
|
| 630 |
+
authors_str += " et al."
|
| 631 |
+
|
| 632 |
+
year_str = f" ({paper.year})" if paper.year else ""
|
| 633 |
+
source_emoji = "π" if paper.source == "semantic_scholar" else "π¬"
|
| 634 |
+
|
| 635 |
+
# Create display label
|
| 636 |
+
label = f"{i}. {source_emoji} {paper.title}{year_str} | {authors_str}"
|
| 637 |
+
choices.append(label)
|
| 638 |
+
|
| 639 |
+
first_choice = choices[0] if choices else None
|
| 640 |
+
status_msg = f"β
Found {len(results)} papers from Semantic Scholar & arXiv"
|
| 641 |
+
status_msg += "\n\n**β‘οΈ Next:** Select a paper from the list below, then click 'Use Selected Paper'"
|
| 642 |
+
|
| 643 |
+
print(f"[DEBUG] handle_search - found {len(choices)} papers")
|
| 644 |
+
print(f"[DEBUG] choices: {choices[:2]}...")
|
| 645 |
+
|
| 646 |
+
return (
|
| 647 |
+
gr.update(choices=choices, value=first_choice),
|
| 648 |
+
status_msg,
|
| 649 |
+
query
|
| 650 |
+
)
|
| 651 |
+
|
| 652 |
+
except Exception as e:
|
| 653 |
+
print(f"[ERROR] Search failed: {e}")
|
| 654 |
+
return (
|
| 655 |
+
gr.update(choices=[], value=None),
|
| 656 |
+
f"β Search failed: {str(e)}",
|
| 657 |
+
query
|
| 658 |
+
)
|
| 659 |
+
|
| 660 |
+
search_btn.click(
|
| 661 |
+
fn=handle_search,
|
| 662 |
+
inputs=[search_query],
|
| 663 |
+
outputs=[search_results, search_status, selected_search_query]
|
| 664 |
+
)
|
| 665 |
+
|
| 666 |
+
def handle_use_selected(selected_idx, query):
|
| 667 |
+
"""Handle 'Use Selected Paper' button click"""
|
| 668 |
+
pdf_url, status_msg = on_paper_select(selected_idx, query)
|
| 669 |
+
# Add instruction to the status message
|
| 670 |
+
if pdf_url:
|
| 671 |
+
status_msg += "\n\nβ‘οΈ **Next:** Switch to the 'π URL' tab to see the paper URL, then click 'ποΈ Generate Podcast'"
|
| 672 |
+
return pdf_url, status_msg, pdf_url # Update url_input with PDF URL
|
| 673 |
+
|
| 674 |
+
use_selected_btn.click(
|
| 675 |
+
fn=handle_use_selected,
|
| 676 |
+
inputs=[search_results, selected_search_query],
|
| 677 |
+
outputs=[selected_pdf_url, search_status, url_input]
|
| 678 |
+
)
|
| 679 |
+
|
| 680 |
+
with gr.Accordion("βοΈ Advanced Options", open=False, visible=True) as advanced_accordion:
|
| 681 |
advanced_mode = gr.Checkbox(label="Batch Mode (Multiple Papers)")
|
| 682 |
+
|
| 683 |
# Warning message (only visible in batch mode)
|
| 684 |
batch_warning = gr.Markdown(
|
| 685 |
"""
|
| 686 |
> **β οΈ Experimental Feature**
|
| 687 |
+
>
|
| 688 |
+
> Batch mode is currently experimental and may not work reliably in all cases.
|
| 689 |
> Some attempts may fail due to model limitations or processing errors.
|
| 690 |
> If you experience issues, try processing papers individually.
|
| 691 |
""",
|
| 692 |
visible=False
|
| 693 |
)
|
| 694 |
+
|
| 695 |
with gr.Group(visible=False) as batch_inputs:
|
| 696 |
multi_url_input = gr.Textbox(label="Multiple URLs (one per line)", lines=3)
|
| 697 |
multi_pdf_upload = gr.File(label="Multiple PDFs", file_count="multiple")
|
| 698 |
+
|
| 699 |
gr.Markdown("---")
|
| 700 |
gr.Markdown("### π Context Settings")
|
| 701 |
+
|
| 702 |
# Context limit slider (only visible in batch mode)
|
| 703 |
context_limit_slider = gr.Slider(
|
| 704 |
minimum=50000,
|
|
|
|
| 708 |
label="Max Context Limit (characters)",
|
| 709 |
info="β οΈ Warning: Increasing this limit will increase token costs and processing time."
|
| 710 |
)
|
| 711 |
+
|
| 712 |
def toggle_advanced(adv):
|
| 713 |
return {
|
| 714 |
batch_warning: gr.update(visible=adv),
|
|
|
|
| 718 |
}
|
| 719 |
advanced_mode.change(toggle_advanced, advanced_mode, [batch_warning, batch_inputs, url_input, pdf_upload])
|
| 720 |
|
| 721 |
+
# Hide Advanced Options when Search (PAD) tab is selected
|
| 722 |
+
def on_tab_select(evt: gr.SelectData):
|
| 723 |
+
"""Handle tab selection - hide batch mode for Search tab"""
|
| 724 |
+
# Tab indices: 0=URL, 1=PDF Upload, 2=Search (PAD)
|
| 725 |
+
is_search_tab = (evt.index == 2)
|
| 726 |
+
return gr.update(visible=not is_search_tab)
|
| 727 |
+
|
| 728 |
+
input_tabs.select(
|
| 729 |
+
fn=on_tab_select,
|
| 730 |
+
outputs=[advanced_accordion]
|
| 731 |
+
)
|
| 732 |
+
|
| 733 |
generate_btn = gr.Button(
|
| 734 |
"ποΈ Generate Podcast",
|
| 735 |
variant="primary",
|
|
|
|
| 999 |
|
| 1000 |
# About PaperCast
|
| 1001 |
|
| 1002 |
+
**The world's first adaptive persona-driven academic podcast platform with intelligent paper discovery.**
|
| 1003 |
|
| 1004 |
+
Transform any research paper into engaging audio conversations with your choice of style β from casual explanations to brutal critiques. Powered by our revolutionary **Podcast Persona Framework (PPF)**, **Paper Auto-Discovery (PAD)** engine, MCP tools, and studio-quality TTS.
|
| 1005 |
|
| 1006 |
---
|
| 1007 |
|
| 1008 |
+
## π Revolutionary Frameworks
|
| 1009 |
+
|
| 1010 |
+
### **PAD** β Paper Auto-Discovery Engine
|
| 1011 |
+
**The world's first intelligent multi-source paper discovery system built specifically for podcast generation.**
|
| 1012 |
+
|
| 1013 |
+
Finding the right research paper shouldn't be a chore. We built **PAD (Paper Auto-Discovery)** from the ground up β a custom-engineered search system that goes beyond simple keyword matching.
|
| 1014 |
+
|
| 1015 |
+
**What makes PAD revolutionary:**
|
| 1016 |
+
|
| 1017 |
+
π **Multi-Source Intelligence** β Searches across multiple academic databases simultaneously:
|
| 1018 |
+
- **Semantic Scholar Graph API** - Access to 200M+ papers with semantic understanding
|
| 1019 |
+
- **arXiv** - Latest preprints and cutting-edge research
|
| 1020 |
+
- Parallel execution for lightning-fast results (under 2 seconds)
|
| 1021 |
+
|
| 1022 |
+
π§ **Smart Result Aggregation** β Built from scratch with advanced deduplication:
|
| 1023 |
+
- Intelligent title matching across sources
|
| 1024 |
+
- Eliminates duplicates while preserving metadata quality
|
| 1025 |
+
- Prioritizes papers with open-access PDFs
|
| 1026 |
+
|
| 1027 |
+
β‘ **Seamless Integration** β No copy-paste, no manual URL hunting:
|
| 1028 |
+
- Search directly within PaperCast interface
|
| 1029 |
+
- One-click paper selection
|
| 1030 |
+
- Automatic PDF URL extraction and validation
|
| 1031 |
+
- Instant transition to podcast generation
|
| 1032 |
+
|
| 1033 |
+
π― **Research-Grade Quality** β Enterprise-level reliability:
|
| 1034 |
+
- Graceful handling of API rate limits
|
| 1035 |
+
- Fallback strategies when one source fails
|
| 1036 |
+
- Comprehensive error handling and user feedback
|
| 1037 |
+
- Extracts full metadata (authors, year, abstract, citations)
|
| 1038 |
+
|
| 1039 |
+
**Why we built PAD from scratch:**
|
| 1040 |
+
|
| 1041 |
+
Existing search tools are designed for reading papers, not generating podcasts. We needed:
|
| 1042 |
+
- **Speed**: Parallel API calls return results in under 2 seconds
|
| 1043 |
+
- **Reliability**: Custom retry logic and fallback strategies
|
| 1044 |
+
- **Integration**: Direct pipeline from search β PDF β podcast
|
| 1045 |
+
- **User Experience**: No context switching, no tab juggling
|
| 1046 |
+
|
| 1047 |
+
**Technical Innovation:**
|
| 1048 |
+
- Custom Python engine using `ThreadPoolExecutor` for concurrent API calls
|
| 1049 |
+
- Smart result ranking combining relevance scores from multiple sources
|
| 1050 |
+
- Automatic PDF URL construction for arXiv papers
|
| 1051 |
+
- State-of-the-art deduplication using fuzzy title matching
|
| 1052 |
+
---
|
| 1053 |
|
| 1054 |
### **PPF** β Podcast Persona Framework
|
| 1055 |
**The world's first adaptive persona system for AI-generated academic podcasts.**
|
|
|
|
| 1097 |
|
| 1098 |
## π― How It Works
|
| 1099 |
|
| 1100 |
+
Our intelligent agent orchestrates a **dual-innovation pipeline** combining PAD and PPF:
|
| 1101 |
|
| 1102 |
+
1. **π Discovery (PAD)** - Search across Semantic Scholar & arXiv simultaneously, get results in <2 seconds
|
| 1103 |
+
2. **π₯ Input** - Select paper from PAD results, or use URL/PDF upload
|
| 1104 |
+
3. **π Extraction** - PyMuPDF intelligently extracts paper structure
|
| 1105 |
+
4. **π Persona Selection** - Choose from 5 unique conversation modes (PPF)
|
| 1106 |
+
5. **π¬ Script Generation** - LLM generates character-specific dialogue with distinct personalities
|
| 1107 |
+
6. **π£οΈ Dynamic Mapping** - Automatic voice assignment based on persona characters
|
| 1108 |
+
7. **π€ Voice Synthesis** - Studio-quality audio with ElevenLabs Turbo v2.5 or Supertonic
|
| 1109 |
+
8. **β
Delivery** - Listen, download, share your personalized podcast
|
| 1110 |
|
| 1111 |
+
**What makes this special:** Unlike generic converters, we built **two groundbreaking systems from scratch** β PAD for intelligent discovery and PPF for adaptive personas.
|
| 1112 |
|
| 1113 |
---
|
| 1114 |
|
| 1115 |
## π Key Features
|
| 1116 |
|
| 1117 |
+
π **PAD - Paper Auto-Discovery** β Custom-built multi-source search engine (Semantic Scholar + arXiv) with parallel execution
|
| 1118 |
+
|
| 1119 |
+
π **5 Revolutionary Persona Modes** β First-of-its-kind adaptive conversation system (PPF)
|
| 1120 |
|
| 1121 |
π§ **Dynamic Character Intelligence** β Real personalities, not generic voices
|
| 1122 |
|
| 1123 |
+
β‘ **Lightning-Fast Search** β Get 5 relevant papers in under 2 seconds with intelligent deduplication
|
| 1124 |
+
|
| 1125 |
ποΈ **Studio-Quality Audio** β ElevenLabs Turbo v2.5 (250ms latency, cinematic quality)
|
| 1126 |
|
| 1127 |
π§ **Universal Compatibility** β Works with any LLM (OpenAI, local models, reasoning models)
|
| 1128 |
|
|
|
|
|
|
|
| 1129 |
π **Complete History** β All podcasts saved locally with metadata
|
| 1130 |
|
| 1131 |
π **Multi-Paper Support** β Batch process multiple papers into comprehensive discussions
|
| 1132 |
|
| 1133 |
π― **Provider Agnostic** β Bring your own API keys, use local models, total flexibility
|
| 1134 |
|
| 1135 |
+
π **Zero Friction Workflow** β From search query to podcast in 60 seconds
|
| 1136 |
+
|
| 1137 |
---
|
| 1138 |
|
| 1139 |
## π§ Technology Stack
|
| 1140 |
|
| 1141 |
+
**Core Innovations**:
|
| 1142 |
+
- **PAD (Paper Auto-Discovery)** β Custom multi-source search engine built from scratch
|
| 1143 |
+
- **PPF (Podcast Persona Framework)** β Proprietary adaptive conversation system
|
| 1144 |
|
| 1145 |
**LLM**: Universal support (OpenAI GPT-4o/o1, local LLMs, reasoning models)
|
| 1146 |
**TTS**: ElevenLabs Turbo v2.5 (premium) or Supertonic (free CPU-based)
|
| 1147 |
**PDF Processing**: PyMuPDF for fast, accurate text extraction
|
|
|
|
| 1148 |
**UI Framework**: Gradio 6 with custom glass-morphism design
|
| 1149 |
**Agent Architecture**: Custom Python orchestrator with MCP tools
|
|
|
|
| 1150 |
|
| 1151 |
---
|
| 1152 |
|
|
|
|
| 1156 |
*Tag: `mcp-in-action-track-consumer`*
|
| 1157 |
|
| 1158 |
**What we're showcasing:**
|
| 1159 |
+
- π **PAD Innovation** - First-ever custom multi-source paper discovery engine built for podcast generation
|
| 1160 |
- π **PPF Innovation** - First-ever adaptive persona system for academic podcasts
|
| 1161 |
- π€ **Autonomous Agent** - Intelligent planning, reasoning, and persona-aware execution
|
| 1162 |
- π§ **MCP Integration** - Tools as cognitive extensions for the agent
|
| 1163 |
+
- π¨ **Gradio 6 UX** - Glass-morphism design with intuitive search & persona controls
|
| 1164 |
- π **Real Impact** - Making research accessible and engaging for everyone
|
| 1165 |
|
| 1166 |
+
**Why PAD + PPF matter for this hackathon:** We didn't just build a tool β we invented **two new paradigms**. PAD solves the discovery problem (finding papers), PPF solves the consumption problem (understanding papers). Together, they create a **zero-friction pipeline** from curiosity to knowledge.
|
| 1167 |
|
| 1168 |
---
|
| 1169 |
|
| 1170 |
## π About the Agent
|
| 1171 |
|
| 1172 |
+
PaperCast's **discovery-aware, persona-driven autonomous agent** makes intelligent decisions at every step:
|
| 1173 |
|
| 1174 |
+
- **π Discovery Intelligence** - Orchestrates parallel API calls to multiple paper sources, ranks and deduplicates results
|
| 1175 |
- **π§ Persona Analysis** - Evaluates paper complexity and matches optimal persona mode
|
| 1176 |
- **π Strategic Planning** - Determines conversation flow based on selected persona (debate-style vs. teaching-style)
|
| 1177 |
- **π Character Orchestration** - Generates distinct personalities for each persona (Dr. Morgan β The Critic β Professor Chen)
|
|
|
|
| 1179 |
- **π£οΈ Dynamic Synthesis** - Maps persona characters to voice IDs automatically
|
| 1180 |
- **π Multi-Paper Intelligence** - Synthesizes insights across papers while maintaining persona consistency
|
| 1181 |
|
| 1182 |
+
**The key insight:** The agent doesn't just process papers β it **discovers and performs** them. PAD finds the perfect paper, PPF delivers it in your perfect style.
|
| 1183 |
|
| 1184 |
---
|
| 1185 |
|
| 1186 |
## π‘ Use Cases
|
| 1187 |
|
| 1188 |
### π§ **Learning & Education**
|
| 1189 |
+
- **PAD Search** β Find "transformer attention mechanisms" β Get 5 papers instantly
|
| 1190 |
- **Pedagogical mode** for complex topics you want to master
|
| 1191 |
- **Friendly Explainer** for quick overviews during commutes
|
| 1192 |
- **Interdisciplinary Clash** to understand papers outside your field
|
| 1193 |
|
| 1194 |
### π¬ **Research & Analysis**
|
| 1195 |
+
- **PAD Search** β Discover latest papers on your research topic
|
| 1196 |
- **Academic Debate** for critical evaluation of methodologies
|
| 1197 |
- **Savage Roast** to identify weak points and overstated claims
|
| 1198 |
+
- Quick paper screening before deep reading (60 seconds from search to audio)
|
| 1199 |
|
| 1200 |
### π **Accessibility**
|
| 1201 |
+
- **Zero barrier to entry** β No URLs, no downloads, just search and listen
|
| 1202 |
- Make cutting-edge research understandable for non-experts
|
| 1203 |
- Bridge knowledge gaps between disciplines
|
| 1204 |
- Learn through conversation, not dry text
|
| 1205 |
|
| 1206 |
### π **Entertainment**
|
| 1207 |
+
- **PAD + Savage Roast combo** β Find trending papers and roast them
|
| 1208 |
- Host paper "debate clubs" with Academic Debate mode
|
| 1209 |
- Share entertaining takes on research with Savage Roast clips
|
| 1210 |
|
|
|
|
| 1212 |
|
| 1213 |
## π What Makes Us Different
|
| 1214 |
|
| 1215 |
+
π **We built PAD from scratch** β First custom multi-source academic search engine designed for podcast generation. Parallel API orchestration, smart deduplication, zero-friction UX.
|
| 1216 |
+
|
| 1217 |
π **We invented PPF** β The Podcast Persona Framework is a **world-first innovation**. No other platform offers adaptive conversation personas.
|
| 1218 |
|
| 1219 |
+
β‘ **End-to-end innovation** β Most tools stop at URL β podcast. We solved **discovery + consumption** with two custom-built systems.
|
| 1220 |
+
|
| 1221 |
π§ **Real characters, not voices** β Other tools change tone. We create **distinct personalities** with names, perspectives, and consistent behavior.
|
| 1222 |
|
| 1223 |
+
π **60-second pipeline** β From search query ("diffusion models") to finished podcast in under a minute. No other platform comes close.
|
| 1224 |
|
| 1225 |
+
π§ **Built for flexibility** β Provider-agnostic design works with any LLM, any TTS, any infrastructure.
|
| 1226 |
|
| 1227 |
+
π― **User empowerment** β You choose what to listen to (PAD) and how to listen (PPF). Complete control over discovery and consumption.
|
| 1228 |
|
| 1229 |
+
**The bottom line:** Every other podcast generator is a one-trick pony. PaperCast is a **research discovery platform + repertory theater company** β we find papers you love and perform them your way.
|
| 1230 |
|
| 1231 |
---
|
| 1232 |
|
|
@@ -1 +1,23 @@
|
|
| 1 |
-
[
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
[
|
| 2 |
+
{
|
| 3 |
+
"url": "https://arxiv.org/abs/2511.20623",
|
| 4 |
+
"audio_path": "/home/batuhan/lab/papercast/output/podcast_20251126_141725.wav",
|
| 5 |
+
"script_length": 8,
|
| 6 |
+
"timestamp": "2025-11-26 14:17:25",
|
| 7 |
+
"audio_filename": "podcast_20251126_141725.wav"
|
| 8 |
+
},
|
| 9 |
+
{
|
| 10 |
+
"url": "Multiple papers: https://arxiv.org/pdf/2405.03150v2",
|
| 11 |
+
"audio_path": "/home/batuhan/lab/papercast/output/podcast_20251126_141940.wav",
|
| 12 |
+
"script_length": 15,
|
| 13 |
+
"timestamp": "2025-11-26 14:19:40",
|
| 14 |
+
"audio_filename": "podcast_20251126_141940.wav"
|
| 15 |
+
},
|
| 16 |
+
{
|
| 17 |
+
"url": "Multiple papers: https://arxiv.org/abs/2511.18514, https://arxiv.org/abs/2509.07203, Uploaded PDF: /tmp/gradio/19417dc5c6f35b380443f077940de9674d8ddc1e21b9224074d80c56f784fbeb/A Comprehensive Analysis of Solar-Powered Electric Vehicle Charging Infrastructure.pdf",
|
| 18 |
+
"audio_path": "/home/batuhan/lab/papercast/output/podcast_20251126_142132.wav",
|
| 19 |
+
"script_length": 16,
|
| 20 |
+
"timestamp": "2025-11-26 14:21:32",
|
| 21 |
+
"audio_filename": "podcast_20251126_142132.wav"
|
| 22 |
+
}
|
| 23 |
+
]
|
|
@@ -0,0 +1,309 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Paper Auto-Discovery (PAD) Module
|
| 3 |
+
|
| 4 |
+
Provides intelligent paper search across multiple sources:
|
| 5 |
+
- Semantic Scholar Graph API v1
|
| 6 |
+
- arXiv API
|
| 7 |
+
|
| 8 |
+
Aggregates results and provides unified interface for paper discovery.
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
import requests
|
| 12 |
+
import xml.etree.ElementTree as ET
|
| 13 |
+
from typing import List, Dict, Optional
|
| 14 |
+
from concurrent.futures import ThreadPoolExecutor, as_completed
|
| 15 |
+
import logging
|
| 16 |
+
|
| 17 |
+
logger = logging.getLogger(__name__)
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
class PaperSearchResult:
|
| 21 |
+
"""Represents a single paper search result"""
|
| 22 |
+
|
| 23 |
+
def __init__(
|
| 24 |
+
self,
|
| 25 |
+
title: str,
|
| 26 |
+
authors: List[str],
|
| 27 |
+
year: Optional[int],
|
| 28 |
+
abstract: str,
|
| 29 |
+
url: str,
|
| 30 |
+
pdf_url: Optional[str],
|
| 31 |
+
source: str, # "semantic_scholar" or "arxiv"
|
| 32 |
+
paper_id: str,
|
| 33 |
+
):
|
| 34 |
+
self.title = title
|
| 35 |
+
self.authors = authors
|
| 36 |
+
self.year = year
|
| 37 |
+
self.abstract = abstract
|
| 38 |
+
self.url = url
|
| 39 |
+
self.pdf_url = pdf_url
|
| 40 |
+
self.source = source
|
| 41 |
+
self.paper_id = paper_id
|
| 42 |
+
|
| 43 |
+
def to_dict(self) -> Dict:
|
| 44 |
+
"""Convert to dictionary for easy JSON serialization"""
|
| 45 |
+
return {
|
| 46 |
+
"title": self.title,
|
| 47 |
+
"authors": self.authors,
|
| 48 |
+
"year": self.year,
|
| 49 |
+
"abstract": self.abstract,
|
| 50 |
+
"url": self.url,
|
| 51 |
+
"pdf_url": self.pdf_url,
|
| 52 |
+
"source": self.source,
|
| 53 |
+
"paper_id": self.paper_id,
|
| 54 |
+
}
|
| 55 |
+
|
| 56 |
+
def __repr__(self):
|
| 57 |
+
authors_str = ", ".join(self.authors[:3])
|
| 58 |
+
if len(self.authors) > 3:
|
| 59 |
+
authors_str += " et al."
|
| 60 |
+
return f"<PaperSearchResult: {self.title[:50]}... by {authors_str} ({self.year})>"
|
| 61 |
+
|
| 62 |
+
|
| 63 |
+
class PaperDiscoveryEngine:
|
| 64 |
+
"""
|
| 65 |
+
PAD - Paper Auto-Discovery Engine
|
| 66 |
+
|
| 67 |
+
Searches for research papers across multiple sources and returns
|
| 68 |
+
unified results with PDF links when available.
|
| 69 |
+
"""
|
| 70 |
+
|
| 71 |
+
SEMANTIC_SCHOLAR_API = "https://api.semanticscholar.org/graph/v1/paper/search"
|
| 72 |
+
ARXIV_API = "http://export.arxiv.org/api/query"
|
| 73 |
+
|
| 74 |
+
def __init__(self, max_results: int = 5):
|
| 75 |
+
self.max_results = max_results
|
| 76 |
+
self.session = requests.Session()
|
| 77 |
+
# Set user agent to avoid 403 errors
|
| 78 |
+
self.session.headers.update({
|
| 79 |
+
"User-Agent": "PaperCast/1.0 (Research Paper Discovery; batuhan@papercast.io)"
|
| 80 |
+
})
|
| 81 |
+
|
| 82 |
+
def search(self, query: str) -> List[PaperSearchResult]:
|
| 83 |
+
"""
|
| 84 |
+
Search for papers across all sources in parallel.
|
| 85 |
+
|
| 86 |
+
Args:
|
| 87 |
+
query: Search query (e.g., "diffusion models", "Grok reasoning")
|
| 88 |
+
|
| 89 |
+
Returns:
|
| 90 |
+
List of PaperSearchResult objects, sorted by relevance
|
| 91 |
+
"""
|
| 92 |
+
logger.info(f"PAD: Searching for '{query}'")
|
| 93 |
+
|
| 94 |
+
results = []
|
| 95 |
+
|
| 96 |
+
# Run both API calls in parallel for speed
|
| 97 |
+
with ThreadPoolExecutor(max_workers=2) as executor:
|
| 98 |
+
future_semantic = executor.submit(self._search_semantic_scholar, query)
|
| 99 |
+
future_arxiv = executor.submit(self._search_arxiv, query)
|
| 100 |
+
|
| 101 |
+
# Collect results as they complete
|
| 102 |
+
for future in as_completed([future_semantic, future_arxiv]):
|
| 103 |
+
try:
|
| 104 |
+
partial_results = future.result()
|
| 105 |
+
results.extend(partial_results)
|
| 106 |
+
except Exception as e:
|
| 107 |
+
logger.error(f"PAD: Search failed for one source: {e}")
|
| 108 |
+
|
| 109 |
+
# Deduplicate by title (case-insensitive)
|
| 110 |
+
seen_titles = set()
|
| 111 |
+
unique_results = []
|
| 112 |
+
for result in results:
|
| 113 |
+
title_lower = result.title.lower().strip()
|
| 114 |
+
if title_lower not in seen_titles:
|
| 115 |
+
seen_titles.add(title_lower)
|
| 116 |
+
unique_results.append(result)
|
| 117 |
+
|
| 118 |
+
# Limit to max_results
|
| 119 |
+
unique_results = unique_results[:self.max_results]
|
| 120 |
+
|
| 121 |
+
logger.info(f"PAD: Found {len(unique_results)} unique papers")
|
| 122 |
+
return unique_results
|
| 123 |
+
|
| 124 |
+
def _search_semantic_scholar(self, query: str) -> List[PaperSearchResult]:
|
| 125 |
+
"""Search Semantic Scholar Graph API v1"""
|
| 126 |
+
try:
|
| 127 |
+
logger.debug("PAD: Querying Semantic Scholar...")
|
| 128 |
+
|
| 129 |
+
params = {
|
| 130 |
+
"query": query,
|
| 131 |
+
"fields": "title,authors,year,abstract,openAccessPdf,url,paperId",
|
| 132 |
+
"limit": self.max_results,
|
| 133 |
+
}
|
| 134 |
+
|
| 135 |
+
response = self.session.get(
|
| 136 |
+
self.SEMANTIC_SCHOLAR_API,
|
| 137 |
+
params=params,
|
| 138 |
+
timeout=10
|
| 139 |
+
)
|
| 140 |
+
|
| 141 |
+
# Handle rate limiting gracefully - just skip Semantic Scholar
|
| 142 |
+
if response.status_code == 429:
|
| 143 |
+
logger.warning("PAD: Semantic Scholar rate limit exceeded (429). Relying on arXiv results.")
|
| 144 |
+
return []
|
| 145 |
+
|
| 146 |
+
response.raise_for_status()
|
| 147 |
+
|
| 148 |
+
data = response.json()
|
| 149 |
+
papers = data.get("data", [])
|
| 150 |
+
|
| 151 |
+
results = []
|
| 152 |
+
for paper in papers:
|
| 153 |
+
# Extract PDF URL if available
|
| 154 |
+
pdf_url = None
|
| 155 |
+
if paper.get("openAccessPdf"):
|
| 156 |
+
pdf_url = paper["openAccessPdf"].get("url")
|
| 157 |
+
|
| 158 |
+
# Extract author names
|
| 159 |
+
authors = []
|
| 160 |
+
for author in paper.get("authors", []):
|
| 161 |
+
if "name" in author:
|
| 162 |
+
authors.append(author["name"])
|
| 163 |
+
|
| 164 |
+
result = PaperSearchResult(
|
| 165 |
+
title=paper.get("title", "Untitled"),
|
| 166 |
+
authors=authors,
|
| 167 |
+
year=paper.get("year"),
|
| 168 |
+
abstract=paper.get("abstract", "No abstract available."),
|
| 169 |
+
url=paper.get("url", ""),
|
| 170 |
+
pdf_url=pdf_url,
|
| 171 |
+
source="semantic_scholar",
|
| 172 |
+
paper_id=paper.get("paperId", ""),
|
| 173 |
+
)
|
| 174 |
+
results.append(result)
|
| 175 |
+
|
| 176 |
+
logger.debug(f"PAD: Semantic Scholar returned {len(results)} papers")
|
| 177 |
+
return results
|
| 178 |
+
|
| 179 |
+
except Exception as e:
|
| 180 |
+
logger.error(f"PAD: Semantic Scholar search failed: {e}")
|
| 181 |
+
return []
|
| 182 |
+
|
| 183 |
+
def _search_arxiv(self, query: str) -> List[PaperSearchResult]:
|
| 184 |
+
"""Search arXiv API"""
|
| 185 |
+
try:
|
| 186 |
+
logger.debug("PAD: Querying arXiv...")
|
| 187 |
+
|
| 188 |
+
params = {
|
| 189 |
+
"search_query": f"all:{query}",
|
| 190 |
+
"max_results": self.max_results,
|
| 191 |
+
"sortBy": "relevance",
|
| 192 |
+
"sortOrder": "descending",
|
| 193 |
+
}
|
| 194 |
+
|
| 195 |
+
response = self.session.get(
|
| 196 |
+
self.ARXIV_API,
|
| 197 |
+
params=params,
|
| 198 |
+
timeout=10
|
| 199 |
+
)
|
| 200 |
+
response.raise_for_status()
|
| 201 |
+
|
| 202 |
+
# Parse XML response
|
| 203 |
+
root = ET.fromstring(response.content)
|
| 204 |
+
|
| 205 |
+
# Define namespace
|
| 206 |
+
ns = {
|
| 207 |
+
"atom": "http://www.w3.org/2005/Atom",
|
| 208 |
+
"arxiv": "http://arxiv.org/schemas/atom"
|
| 209 |
+
}
|
| 210 |
+
|
| 211 |
+
results = []
|
| 212 |
+
for entry in root.findall("atom:entry", ns):
|
| 213 |
+
# Extract title
|
| 214 |
+
title_elem = entry.find("atom:title", ns)
|
| 215 |
+
title = title_elem.text.strip() if title_elem is not None else "Untitled"
|
| 216 |
+
|
| 217 |
+
# Extract authors
|
| 218 |
+
authors = []
|
| 219 |
+
for author in entry.findall("atom:author", ns):
|
| 220 |
+
name_elem = author.find("atom:name", ns)
|
| 221 |
+
if name_elem is not None:
|
| 222 |
+
authors.append(name_elem.text.strip())
|
| 223 |
+
|
| 224 |
+
# Extract abstract
|
| 225 |
+
summary_elem = entry.find("atom:summary", ns)
|
| 226 |
+
abstract = summary_elem.text.strip() if summary_elem is not None else "No abstract available."
|
| 227 |
+
|
| 228 |
+
# Extract URL (abstract page)
|
| 229 |
+
url_elem = entry.find("atom:id", ns)
|
| 230 |
+
url = url_elem.text.strip() if url_elem is not None else ""
|
| 231 |
+
|
| 232 |
+
# Extract PDF URL
|
| 233 |
+
pdf_url = None
|
| 234 |
+
for link in entry.findall("atom:link", ns):
|
| 235 |
+
if link.get("type") == "application/pdf":
|
| 236 |
+
pdf_url = link.get("href")
|
| 237 |
+
break
|
| 238 |
+
|
| 239 |
+
# Extract year from published date
|
| 240 |
+
published_elem = entry.find("atom:published", ns)
|
| 241 |
+
year = None
|
| 242 |
+
if published_elem is not None:
|
| 243 |
+
try:
|
| 244 |
+
year = int(published_elem.text[:4])
|
| 245 |
+
except (ValueError, TypeError):
|
| 246 |
+
pass
|
| 247 |
+
|
| 248 |
+
# Extract arXiv ID
|
| 249 |
+
paper_id = url.split("/")[-1] if url else ""
|
| 250 |
+
|
| 251 |
+
result = PaperSearchResult(
|
| 252 |
+
title=title,
|
| 253 |
+
authors=authors,
|
| 254 |
+
year=year,
|
| 255 |
+
abstract=abstract,
|
| 256 |
+
url=url,
|
| 257 |
+
pdf_url=pdf_url,
|
| 258 |
+
source="arxiv",
|
| 259 |
+
paper_id=paper_id,
|
| 260 |
+
)
|
| 261 |
+
results.append(result)
|
| 262 |
+
|
| 263 |
+
logger.debug(f"PAD: arXiv returned {len(results)} papers")
|
| 264 |
+
return results
|
| 265 |
+
|
| 266 |
+
except Exception as e:
|
| 267 |
+
logger.error(f"PAD: arXiv search failed: {e}")
|
| 268 |
+
return []
|
| 269 |
+
|
| 270 |
+
def get_pdf_url(self, result: PaperSearchResult) -> Optional[str]:
|
| 271 |
+
"""
|
| 272 |
+
Get the best available PDF URL for a search result.
|
| 273 |
+
|
| 274 |
+
Returns direct PDF URL if available, otherwise returns the paper URL
|
| 275 |
+
which can be processed by the existing fetching logic.
|
| 276 |
+
"""
|
| 277 |
+
if result.pdf_url:
|
| 278 |
+
return result.pdf_url
|
| 279 |
+
|
| 280 |
+
# For arXiv papers without direct PDF link, construct it
|
| 281 |
+
if result.source == "arxiv" and result.url:
|
| 282 |
+
# Convert abstract URL to PDF URL
|
| 283 |
+
# https://arxiv.org/abs/2301.12345 -> https://arxiv.org/pdf/2301.12345.pdf
|
| 284 |
+
return result.url.replace("/abs/", "/pdf/") + ".pdf"
|
| 285 |
+
|
| 286 |
+
# Return the paper URL as fallback (existing logic can handle it)
|
| 287 |
+
return result.url
|
| 288 |
+
|
| 289 |
+
|
| 290 |
+
# Convenience function for easy import
|
| 291 |
+
def search_papers(query: str, max_results: int = 5) -> List[PaperSearchResult]:
|
| 292 |
+
"""
|
| 293 |
+
Search for research papers across multiple sources.
|
| 294 |
+
|
| 295 |
+
Args:
|
| 296 |
+
query: Search query (e.g., "diffusion models", "Grok reasoning")
|
| 297 |
+
max_results: Maximum number of results to return (default: 5)
|
| 298 |
+
|
| 299 |
+
Returns:
|
| 300 |
+
List of PaperSearchResult objects
|
| 301 |
+
|
| 302 |
+
Example:
|
| 303 |
+
>>> results = search_papers("transformer attention mechanisms")
|
| 304 |
+
>>> for paper in results:
|
| 305 |
+
>>> print(f"{paper.title} ({paper.year})")
|
| 306 |
+
>>> print(f" PDF: {paper.pdf_url}")
|
| 307 |
+
"""
|
| 308 |
+
engine = PaperDiscoveryEngine(max_results=max_results)
|
| 309 |
+
return engine.search(query)
|
|
@@ -1,91 +0,0 @@
|
|
| 1 |
-
# PaperCast New features implementations
|
| 2 |
-
|
| 3 |
-
## Vision
|
| 4 |
-
We are not building "another paper summarizer".
|
| 5 |
-
We are building **the world's first interactive, multi-modal, counterfactual-aware, visually-synced academic podcast studio** powered by MCP tools, Gradio 6, PyMuPDF, Semantic Scholar, arXiv and multi-provider TTS.
|
| 6 |
-
|
| 7 |
-
We invented 4 original frameworks that will be heavily emphasized in the demo and submission:
|
| 8 |
-
|
| 9 |
-
- **PPF** β Podcast Persona Framework
|
| 10 |
-
- **PVF** β Paper Visual Framework
|
| 11 |
-
- **PAD** β Paper Auto-Discovery
|
| 12 |
-
- **CPM** β Counterfactual Paper Mode
|
| 13 |
-
|
| 14 |
-
We will constantly refer to these acronyms in the demo:
|
| 15 |
-
"We created the Podcast Persona Framework (PPF) to solve the one-size-fits-all podcast problem" β instant "wow this is professional" effect.
|
| 16 |
-
|
| 17 |
-
## Core Features
|
| 18 |
-
|
| 19 |
-
### 1. Podcast Persona Framework (PPF) β Killer Feature #1
|
| 20 |
-
User selects persona via dropdown + optional custom text box.
|
| 21 |
-
|
| 22 |
-
Implemented modes (exact names):
|
| 23 |
-
|
| 24 |
-
1. **Friendly Explainer** β Current default (two friends casually discussing)
|
| 25 |
-
2. **Academic Debate** β One defends the paper, the other politely challenges ("This claim is strong, but Table 2 baseline seems weak...")
|
| 26 |
-
3. **Savage Roast** β One speaker brutally roasts the paper ("This ablation is an absolute clown show", "Figure 4 is statistical noise"), the other stubbornly defends it
|
| 27 |
-
4. **Pedagogical** β Speaker A = Professor, Speaker B = Curious Student (student constantly asks questions)
|
| 28 |
-
5. **Interdisciplinary Clash** β Speaker A = Domain Expert, Speaker B = Complete Outsider (e.g. biologist reading ML paper β "This neuron analogy makes zero biological sense")
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
### 2. Paper Auto-Discovery (PAD) β Killer Feature #2
|
| 32 |
-
|
| 33 |
-
Input methods:
|
| 34 |
-
- PDF upload
|
| 35 |
-
- Direct URL (arXiv, Semantic Scholar, HF, etc.)
|
| 36 |
-
(NEW INPUT METHOD) - Free text query β "Grok reasons about everything" or "diffusion survey 2025"
|
| 37 |
-
|
| 38 |
-
Workflow:
|
| 39 |
-
1. Agent calls **Semantic Scholar Graph v1 API** (`/paper/search?query=...&fields=title,authors,year,abstract,openAccessPdf,url`)
|
| 40 |
-
2. Parallel call to **arXiv API** (`http://export.arxiv.org/api/query?search_query=...`)
|
| 41 |
-
3. Collect top 5 results β show user title + abstract + year + source in gr.Radio or clickable cards
|
| 42 |
-
4. User selects β if openAccessPdf exists β download directly β PyMuPDF extract
|
| 43 |
-
5. Otherwise fetch from arXiv
|
| 44 |
-
|
| 45 |
-
Zero friction paper discovery.
|
| 46 |
-
|
| 47 |
-
### 3. Paper Visual Framework (PVF) β Killer Feature #3 (Jury will lose their minds)
|
| 48 |
-
Right column of Gradio interface shows embedded PDF viewer (PDF.js).
|
| 49 |
-
|
| 50 |
-
- PDF viewer shows the original paper alongside the podcast
|
| 51 |
-
- When speakers reference sections β users can view in the PDF
|
| 52 |
-
- Transcript entries become clickable timestamps that jump to relevant sections
|
| 53 |
-
- Implementation: ElevenLabs streaming β parse chunk for figure/table mentions β emit JS event β PDF.js control
|
| 54 |
-
|
| 55 |
-
This single feature wins "Best UX" + "Most Innovative" categories alone.
|
| 56 |
-
|
| 57 |
-
### 4. Counterfactual Paper Mode ("What If?")
|
| 58 |
-
Post-podcast button:
|
| 59 |
-
"What if this paper was written by Yann LeCun? / in 2012? / if GPT-4 never existed? / by DeepMind instead of OpenAI?"
|
| 60 |
-
|
| 61 |
-
β Claude re-writes/re-interprets the same paper in alternate reality β new podcast generated.
|
| 62 |
-
Extremely fun, extremely memorable, extremely shareable.
|
| 63 |
-
|
| 64 |
-
### 5. Ultra Transcript System
|
| 65 |
-
- Timestamped (00:00:12)
|
| 66 |
-
- Speaker-labeled (Savage Critic:, Professor:, etc.)
|
| 67 |
-
- Clickable figure/table references (syncs with PVF)
|
| 68 |
-
- LaTeX equations rendered via MathJax
|
| 69 |
-
- Download buttons: .txt, .srt, .docx, .vtt
|
| 70 |
-
- Bonus: "Copy as tweet" β auto-selects the 3 spiciest quotes with citation
|
| 71 |
-
|
| 72 |
-
## Final UI Layout (Gradio 6)
|
| 73 |
-
```python
|
| 74 |
-
with gr.Row():
|
| 75 |
-
with gr.Column(scale=3):
|
| 76 |
-
chatbot = gr.Chatbot(height=700, render=True)
|
| 77 |
-
controls = gr.Row() # query input + PPF dropdown + custom persona + buttons
|
| 78 |
-
audio_player = gr.Audio(autoplay=True, streaming=True)
|
| 79 |
-
transcript = gr.Markdown()
|
| 80 |
-
with gr.Column(scale=2):
|
| 81 |
-
pdf_viewer = gr.HTML() # PVF - embedded PDF.js
|
| 82 |
-
timeline_vis = gr.HTML() # PET timeline
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
Required Tools
|
| 86 |
-
|
| 87 |
-
extract_pdf_text β PyMuPDF text extraction (lightweight)
|
| 88 |
-
search_semantic_scholar β returns json (future feature)
|
| 89 |
-
search_arxiv β returns json (future feature)
|
| 90 |
-
fetch_pdf_from_url β returns bytes
|
| 91 |
-
batch_extract_papers (for PET)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|