mishrabp committed
Commit e9c7617 · verified · 1 Parent(s): 36173c8

Upload folder using huggingface_hub

  1. README.md +142 -2
README.md CHANGED
@@ -22,8 +22,6 @@ To achieve this, the project integrates the following technologies and AI featur
  - **SendGrid** (for emailing report)
  - **LLMs** - (OpenAI, Gemini, Groq)
 
- **URL:** https://huggingface.co/spaces/mishrabp/deep-research
-
  ## How it works?
  The system is a multi-agent solution, where each agent has a specific responsibility:
 
@@ -49,3 +47,145 @@ The system is a multi-agent solution, where each agent has a specific responsibi
  - The entry point of the system.
  - Facilitates communication and workflow between all agents.
 
+ ## Project Folder Structure
+
+ ```
+ deep-research/
+ ├── ui/
+ │   ├── app.py                # Main Streamlit application entry point
+ │   └── __pycache__/          # Python bytecode cache
+ ├── appagents/
+ │   ├── __init__.py           # Package initialization
+ │   ├── orchestrator.py       # Orchestrator agent - coordinates all agents
+ │   ├── planner_agent.py      # Planner agent - builds structured query plans
+ │   ├── guardrail_agent.py    # Guardrail agent - validates user input
+ │   ├── search_agent.py       # Search agent - performs web searches
+ │   ├── writer_agent.py       # Writer agent - generates consolidated reports
+ │   ├── email_agent.py        # Email agent - sends reports via email (not functional)
+ │   └── __pycache__/          # Python bytecode cache
+ ├── core/
+ │   ├── __init__.py           # Package initialization
+ │   ├── logger.py             # Centralized logging configuration
+ │   └── __pycache__/          # Python bytecode cache
+ ├── tools/
+ │   ├── __init__.py           # Package initialization
+ │   ├── google_tools.py       # Google search utilities
+ │   ├── time_tools.py         # Time-related utility functions
+ │   └── __pycache__/          # Python bytecode cache
+ ├── prompts/
+ │   ├── __init__.py           # Package initialization (if present)
+ │   ├── planner_prompt.txt    # Prompt for planner agent (if present)
+ │   ├── guardrail_prompt.txt  # Prompt for guardrail agent (if present)
+ │   ├── search_prompt.txt     # Prompt for search agent (if present)
+ │   └── writer_prompt.txt     # Prompt for writer agent (if present)
+ ├── Dockerfile                # Docker configuration for container deployment
+ ├── pyproject.toml            # Project metadata and dependencies (copied from root)
+ ├── uv.lock                   # Locked dependency versions (copied from root)
+ ├── README.md                 # Project documentation
+ └── run.py                    # Script to run the application locally (if present)
+ ```
+
+ ## File Descriptions
+
+ ### UI Layer (`ui/`)
+ - **app.py** - Main Streamlit web application that provides the user interface. Handles:
+   - Text input for research queries
+   - Run/Download buttons (PDF, Markdown)
+   - Real-time streaming of results
+   - Display of final research reports
+   - Session state management
+   - Button enable/disable during streaming
+
+ ### Agents (`appagents/`)
+ - **orchestrator.py** - Central coordinator that:
+   - Manages the multi-agent workflow
+   - Handles communication between all agents
+   - Streams results back to the UI
+   - Implements the research pipeline
+
+ - **planner_agent.py** - Creates a structured plan for the query:
+   - Breaks down user query into actionable research steps
+   - Defines search queries and research angles
+
+ - **guardrail_agent.py** - Validates user input:
+   - Checks for inappropriate content
+   - Ensures compliance with policies
+   - Stops workflow if violations detected
+
+ - **search_agent.py** - Executes web searches:
+   - Performs parallel web searches
+   - Integrates with Google Search / Serper API
+   - Gathers raw research data
+
+ - **writer_agent.py** - Generates final report:
+   - Consolidates search results
+   - Formats findings into structured markdown
+   - Creates well-organized research summaries
+
+ - **email_agent.py** - Email delivery (not functional):
+   - Intended to send reports via SendGrid
+   - Currently not integrated in the workflow
+
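The agent responsibilities listed above suggest a pipeline of guardrail, planner, parallel searches, and writer. The following is a minimal sketch of that flow using only `asyncio`. Every function name here (`guardrail_check`, `plan_searches`, `run_search`, `write_report`, `research`) is a hypothetical stand-in with stubbed behavior, not the project's actual code, and the real agents presumably call an LLM or a search API where these stubs return fixed strings.

```python
import asyncio

# All functions below are hypothetical stand-ins, not the project's real agents.


async def guardrail_check(query: str) -> bool:
    """Reject empty queries; a real guardrail would apply a policy check."""
    return bool(query.strip())


async def plan_searches(query: str) -> list[str]:
    """Turn one user query into several search queries (two fixed angles here)."""
    return [f"{query} overview", f"{query} recent developments"]


async def run_search(search_query: str) -> str:
    """Pretend to search the web; sleep(0) only yields to the event loop."""
    await asyncio.sleep(0)
    return f"results for '{search_query}'"


async def write_report(query: str, findings: list[str]) -> str:
    """Consolidate the findings into a small markdown report."""
    bullets = "\n".join(f"- {item}" for item in findings)
    return f"# Report: {query}\n\n{bullets}"


async def research(query: str) -> str:
    """Run the pipeline: guardrail, plan, concurrent searches, then write."""
    if not await guardrail_check(query):
        raise ValueError("query rejected by guardrail")
    search_queries = await plan_searches(query)
    # gather() runs the searches concurrently and preserves input order.
    findings = await asyncio.gather(*(run_search(q) for q in search_queries))
    return await write_report(query, list(findings))


if __name__ == "__main__":
    print(asyncio.run(research("solid-state batteries")))
```

Raising on a guardrail failure stops the workflow before any search is issued, which matches the "stops workflow if violations detected" behavior described for the guardrail agent.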
+ ### Core Utilities (`core/`)
+ - **logger.py** - Centralized logging configuration:
+   - Provides consistent logging across agents
+   - Handles log levels and formatting
+
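As an illustration of what a centralized logging helper can look like, here is a small sketch built on the standard `logging` module. The function name `get_logger`, the format string, and the logger name used in the example are assumptions for illustration; the actual contents of `core/logger.py` are not shown in this commit.

```python
import logging
import sys

# Assumed format; the project's real format string is not shown in the README.
_FORMAT = "%(asctime)s | %(levelname)s | %(name)s | %(message)s"


def get_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a named logger with a single stdout handler and a shared format."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach the handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(_FORMAT))
        logger.addHandler(handler)
    # Avoid double-printing through the root logger.
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = get_logger("appagents.orchestrator")
    log.info("pipeline started")
```

Each agent module could call `get_logger(__name__)` so that all log lines share one format while remaining attributable to their source module.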
+ ### Tools (`tools/`)
+ - **google_tools.py** - Google/Serper API wrapper:
+   - Executes web searches
+   - Handles API authentication and response parsing
+
+ - **time_tools.py** - Utility functions:
+   - Time-related operations
+   - Timestamp management
+
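To make "API authentication and response parsing" concrete, the sketch below prepares a Serper-style request and extracts the fields a search agent would typically need. It performs no network call. The endpoint URL, the `X-API-KEY` header, and the `organic` / `title` / `link` / `snippet` field names reflect an assumed shape of the Serper API and should be checked against its documentation; `build_request` and `parse_results` are hypothetical names, not functions from `google_tools.py`.

```python
import json

# Assumed endpoint; verify against the Serper documentation before use.
SERPER_URL = "https://google.serper.dev/search"


def build_request(query: str, api_key: str) -> dict:
    """Describe the HTTP POST a wrapper would send (nothing is sent here)."""
    return {
        "url": SERPER_URL,
        "headers": {"X-API-KEY": api_key, "Content-Type": "application/json"},
        "body": json.dumps({"q": query}),
    }


def parse_results(payload: dict, limit: int = 5) -> list[dict]:
    """Keep title, link, and snippet from the assumed 'organic' result list."""
    results = []
    for item in payload.get("organic", [])[:limit]:
        results.append(
            {
                "title": item.get("title", ""),
                "link": item.get("link", ""),
                "snippet": item.get("snippet", ""),
            }
        )
    return results


if __name__ == "__main__":
    sample = {"organic": [{"title": "Example", "link": "https://example.com"}]}
    print(parse_results(sample))
```

Keeping request construction and response parsing as separate pure functions makes both easy to test without an API key.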
+ ### Configuration Files
+ - **Dockerfile** - Container deployment:
+   - Builds Docker image with Python 3.12
+   - Installs dependencies using `uv`
+   - Sets up Streamlit server on port 7860
+   - Configures PYTHONPATH for module imports
+
+ - **pyproject.toml** - Project metadata:
+   - Package name: "agents"
+   - Python version requirement: 3.12
+   - Lists all dependencies (OpenAI, LangChain, Streamlit, etc.)
+
+ - **uv.lock** - Dependency lock file:
+   - Ensures reproducible builds
+   - Pins exact versions of all dependencies
+
+ ## Key Technologies
+
+ | Component | Technology | Purpose |
+ |-----------|------------|---------|
+ | LLM Framework | OpenAI Agents | Multi-agent orchestration |
+ | Web Search | Serper API / Google Search | Research data gathering |
+ | Web UI | Streamlit | User interface and interaction |
+ | Document Export | ReportLab | PDF generation from markdown |
+ | Async Operations | asyncio | Parallel agent execution |
+ | Dependencies | uv | Fast Python package management |
+ | Containerization | Docker | Cloud deployment |
+
+ ## Running Locally
+
+ ```bash
+ # Install dependencies
+ uv sync
+
+ # Set environment variables defined in .env.name file
+ export OPENAI_API_KEY="your-key"
+ export SERPER_API_KEY="your-key"
+
+ # Run the Streamlit app
+ python run.py
+ ```
+
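Because the app depends on the two environment variables exported above, a quick check before launch can save a confusing failure later. The helper below is an illustrative sketch only; `missing_env_vars` is a hypothetical name and is not part of `run.py`.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = ("OPENAI_API_KEY", "SERPER_API_KEY")


def missing_env_vars(env=None) -> list[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Passing a plain dictionary as `env` lets the check be exercised in tests without touching the real process environment.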
+ ## Deployment
+
+ The project is deployed on Hugging Face Spaces as a Docker container:
+ - **Space**: https://huggingface.co/spaces/mishrabp/deep-research
+ - **URL**: https://huggingface.co/spaces/mishrabp/deep-research
+ - **Trigger**: Automatic deployment on push to `main` branch
+ - **Configuration**: `.github/workflows/deep-research-app-hf.yml`