Commit e22dcc4
Initial commit
- .gitignore +42 -0
- LICENSE +11 -0
- README.md +279 -0
- backend/Dockerfile +30 -0
- backend/app/__init__.py +1 -0
- backend/app/api/__init__.py +1 -0
- backend/app/api/endpoints/__init__.py +1 -0
- backend/app/api/endpoints/chat.py +204 -0
- backend/app/api/endpoints/documents.py +216 -0
- backend/app/core/__init__.py +1 -0
- backend/app/core/config.py +47 -0
- backend/app/core/database.py +29 -0
- backend/app/models/__init__.py +1 -0
- backend/app/models/document.py +35 -0
- backend/app/schemas/__init__.py +1 -0
- backend/app/schemas/chat.py +50 -0
- backend/app/schemas/document.py +40 -0
- backend/app/services/__init__.py +1 -0
- backend/app/services/ai_service.py +267 -0
- backend/app/services/pdf_processor.py +105 -0
- backend/app/services/vector_store.py +194 -0
- backend/main.py +125 -0
- backend/requirements.txt +19 -0
- backend/test_openrouter.py +93 -0
- docker-compose.yml +42 -0
- frontend/Dockerfile +25 -0
- frontend/app/globals.css +107 -0
- frontend/app/layout.tsx +26 -0
- frontend/app/page.tsx +128 -0
- frontend/components/ChatInterface.tsx +240 -0
- frontend/components/DocumentList.tsx +223 -0
- frontend/components/DocumentUpload.tsx +199 -0
- frontend/lib/api.ts +109 -0
- frontend/lib/store.ts +50 -0
- frontend/next-env.d.ts +5 -0
- frontend/next.config.js +19 -0
- frontend/package-lock.json +0 -0
- frontend/package.json +44 -0
- frontend/postcss.config.js +6 -0
- frontend/tailwind.config.js +76 -0
- frontend/tsconfig.json +28 -0
- setup.ps1 +96 -0
- setup.sh +90 -0
.gitignore
ADDED
@@ -0,0 +1,42 @@
# --- Python Backend ---
__pycache__/
*.pyc
*.pyo
*.pyd
venv/
ENV/
.env
.venv/
*.db
*.sqlite3
chroma_db/
uploads/
.vscode/
.DS_Store
Thumbs.db
*.log
*.ipynb
.pytest_cache/
dist/
build/
*.egg-info/
htmlcov/
.coverage
coverage.xml

# --- Frontend (Next.js/React) ---
node_modules/
.next/
out/
.env
.env.*
npm-debug.log*
yarn-debug.log*
yarn-error.log*
dist/
coverage/

# --- General ---
# Ignore Docker build cache
# Ignore coverage reports
# Ignore node_modules if any
LICENSE
ADDED
@@ -0,0 +1,11 @@
UNLICENSED - VIEW ONLY

Copyright (c) 2025 Al Amin

Permission is NOT granted to any person obtaining a copy of this software and associated documentation files (the "Software") to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software.

The Software is provided strictly for viewing and educational reference purposes only.

The above copyright notice must be retained, and this notice must appear in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE VIEWING OF THE SOFTWARE.
README.md
ADDED
@@ -0,0 +1,279 @@
# PDF-Based Q&A Chatbot System

A comprehensive end-to-end PDF-based Q&A chatbot system that processes uploaded PDF documents and enables users to retrieve accurate, context-aware answers via natural language queries.

## Features

- **PDF Processing**: Extract text and metadata from uploaded PDF documents
- **Vector Storage**: Store document embeddings in ChromaDB for efficient retrieval
- **AI-Powered Q&A**: Use OpenAI/Claude for intelligent question answering
- **Modern UI**: Clean, responsive interface built with Next.js and Tailwind CSS
- **Real-time Chat**: Interactive chat interface with conversation history
- **File Management**: Upload, view, and manage multiple PDF documents
- **Context Awareness**: Maintain conversation context and document references

## Tech Stack

### Backend
- **FastAPI**: High-performance web framework
- **PyPDF2**: PDF text extraction
- **ChromaDB**: Vector database for embeddings
- **OpenAI/Claude**: AI language models for Q&A
- **SQLAlchemy**: Database ORM
- **Pydantic**: Data validation

### Frontend
- **Next.js 14**: React framework with App Router
- **TypeScript**: Type-safe development
- **Tailwind CSS**: Utility-first styling
- **Shadcn/ui**: Modern UI components
- **React Hook Form**: Form handling
- **Zustand**: State management

## Project Structure

```
ChatbotCursor/
├── backend/
│   ├── app/
│   │   ├── api/
│   │   ├── core/
│   │   ├── models/
│   │   ├── services/
│   │   └── utils/
│   ├── requirements.txt
│   └── main.py
├── frontend/
│   ├── app/
│   ├── components/
│   ├── lib/
│   └── package.json
├── docker-compose.yml
└── README.md
```

## Quick Start

### Option 1: Automated Setup (Recommended)

**For Linux/macOS:**
```bash
chmod +x setup.sh
./setup.sh
```

**For Windows:**
```powershell
.\setup.ps1
```

### Option 2: Manual Setup

1. **Clone and Setup**
   ```bash
   cd ChatbotCursor
   ```

2. **Backend Setup**
   ```bash
   cd backend
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   cp .env.example .env
   # Edit .env and add your API keys
   ```

3. **Frontend Setup**
   ```bash
   cd frontend
   npm install
   cp .env.example .env
   ```

4. **Environment Variables**
   - Edit `backend/.env` and add your API keys:
     - `OPENROUTER_API_KEY` or `ANTHROPIC_API_KEY`
   - The frontend `.env` should work with defaults

+
5. **Run the Application**
|
100 |
+
```bash
|
101 |
+
# Backend (Terminal 1)
|
102 |
+
cd backend
|
103 |
+
source venv/bin/activate # On Windows: venv\Scripts\activate
|
104 |
+
uvicorn main:app --reload
|
105 |
+
|
106 |
+
# Frontend (Terminal 2)
|
107 |
+
cd frontend
|
108 |
+
npm run dev
|
109 |
+
```
|
110 |
+
|
111 |
+
### Option 3: Docker Setup
|
112 |
+
|
113 |
+
```bash
|
114 |
+
# Build and run with Docker Compose
|
115 |
+
docker-compose up --build
|
116 |
+
|
117 |
+
# Or run services individually
|
118 |
+
docker-compose up backend
|
119 |
+
docker-compose up frontend
|
120 |
+
```
|
121 |
+
|
122 |
+
6. **Access the Application**
|
123 |
+
- Frontend: http://localhost:3000
|
124 |
+
- Backend API: http://localhost:8000
|
125 |
+
- API Documentation: http://localhost:8000/docs
|
126 |
+
|
127 |
+
## Usage
|
128 |
+
|
129 |
+
### Getting Started
|
130 |
+
|
131 |
+
1. **Upload Documents**
|
132 |
+
- Navigate to the "Documents" tab
|
133 |
+
- Drag and drop PDF files or click to select
|
134 |
+
- Wait for processing (text extraction and vector embedding)
|
135 |
+
- View upload status and document statistics
|
136 |
+
|
137 |
+
2. **Start Chatting**
|
138 |
+
- Switch to the "Chat" tab
|
139 |
+
- Ask questions about your uploaded documents
|
140 |
+
- Get AI-powered answers with source references
|
141 |
+
- View conversation history
|
142 |
+
|
143 |
+
3. **Document Management**
|
144 |
+
- View all uploaded documents with metadata
|
145 |
+
- Delete documents when no longer needed
|
146 |
+
- Monitor processing status and file sizes
|
147 |
+
|
148 |
+
### Features
|
149 |
+
|
150 |
+
- **Smart Document Processing**: Automatic text extraction and chunking
|
151 |
+
- **Vector Search**: Semantic similarity search for relevant content
|
152 |
+
- **AI-Powered Q&A**: Context-aware answers using OpenAI or Claude
|
153 |
+
- **Source Citations**: See which documents and sections were referenced
|
154 |
+
- **Conversation History**: Persistent chat sessions
|
155 |
+
- **File Management**: Upload, view, and delete documents
|
156 |
+
- **Real-time Processing**: Live status updates during uploads
|
157 |
+
|
158 |
+
### Supported File Types
|
159 |
+
|
160 |
+
- **PDF Documents**: All standard PDF files
|
161 |
+
- **Maximum Size**: 10MB per file
|
162 |
+
- **Processing**: Automatic text extraction and metadata parsing
|
163 |
+
|
164 |
+
## API Endpoints
|
165 |
+
|
166 |
+
### Document Management
|
167 |
+
- `POST /api/v1/documents/upload`: Upload PDF documents
|
168 |
+
- `GET /api/v1/documents/`: List all documents
|
169 |
+
- `GET /api/v1/documents/{id}`: Get specific document
|
170 |
+
- `DELETE /api/v1/documents/{id}`: Delete a document
|
171 |
+
- `GET /api/v1/documents/stats/summary`: Get document statistics
|
172 |
+
|
173 |
+
### Chat & Q&A
|
174 |
+
- `POST /api/v1/chat/`: Send questions and get answers
|
175 |
+
- `GET /api/v1/chat/history/{session_id}`: Get chat history
|
176 |
+
- `POST /api/v1/chat/session/new`: Create new chat session
|
177 |
+
- `GET /api/v1/chat/sessions`: List all sessions
|
178 |
+
- `DELETE /api/v1/chat/session/{session_id}`: Delete session
|
179 |
+
- `GET /api/v1/chat/models/available`: Get available AI models
|
180 |
+
|
181 |
+
### System
|
182 |
+
- `GET /health`: Health check
|
183 |
+
- `GET /docs`: Interactive API documentation (Swagger UI)
|
184 |
+
- `GET /redoc`: Alternative API documentation
|
185 |
+
|
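For reference, the following is a minimal Python client sketch for the endpoints above. It is not part of the codebase; it assumes the backend is running locally on port 8000, that a `sample.pdf` exists in the working directory, and it uses `httpx`, which the backend itself already depends on.

```python
# Hypothetical client sketch for the API above (not included in this commit).
import httpx

API = "http://localhost:8000/api/v1"

with httpx.Client(timeout=60) as client:
    # Upload a PDF (multipart field is named "file"; max 10MB, at most 3 documents)
    with open("sample.pdf", "rb") as f:
        upload = client.post(f"{API}/documents/upload",
                             files={"file": ("sample.pdf", f, "application/pdf")})
    upload.raise_for_status()
    print("Uploaded document id:", upload.json()["document"]["id"])

    # Create a chat session, then ask a question about the uploaded documents
    session_id = client.post(f"{API}/chat/session/new").json()["session_id"]
    chat = client.post(f"{API}/chat/",
                       json={"question": "What is this document about?",
                             "session_id": session_id})
    chat.raise_for_status()
    reply = chat.json()
    print(reply["answer"])
    print("Sources:", reply["sources"])
```
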
186 |
+
## Configuration
|
187 |
+
|
188 |
+
### Environment Variables
|
189 |
+
|
190 |
+
**Backend (.env):**
|
191 |
+
```env
|
192 |
+
# Required: Set at least one AI provider
|
193 |
+
OPENAI_API_KEY=your-openai-api-key
|
194 |
+
ANTHROPIC_API_KEY=your-anthropic-api-key
|
195 |
+
|
196 |
+
# Optional: Customize settings
|
197 |
+
DATABASE_URL=sqlite:///./pdf_chatbot.db
|
198 |
+
CHROMA_PERSIST_DIRECTORY=./chroma_db
|
199 |
+
UPLOAD_DIR=./uploads
|
200 |
+
MAX_FILE_SIZE=10485760
|
201 |
+
```
|
202 |
+
|
203 |
+
**Frontend (.env):**
|
204 |
+
```env
|
205 |
+
NEXT_PUBLIC_API_URL=http://localhost:8000
|
206 |
+
```
|
207 |
+
|
### AI Provider Setup

1. **OpenRouter**: Get an API key from [OpenRouter](https://openrouter.ai/)
2. **Anthropic**: Get API key from [Anthropic Console](https://console.anthropic.com/)

## Development

### Backend Development
```bash
cd backend
source venv/bin/activate
uvicorn main:app --reload --port 8000
```

### Frontend Development
```bash
cd frontend
npm run dev
```

### Testing
```bash
# Backend tests
cd backend
pytest

# Frontend tests
cd frontend
npm test
```

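The commit ships no pytest suite (only the `backend/test_openrouter.py` script), so the `pytest` command above has nothing to collect yet. A minimal smoke test might look like the sketch below, assuming `main.py` exposes the FastAPI `app` object, as the `uvicorn main:app` commands imply.

```python
# backend/tests/test_health.py - hypothetical smoke test, not included in this commit
from fastapi.testclient import TestClient

from main import app  # assumes main.py defines `app`

client = TestClient(app)


def test_health_returns_ok():
    # GET /health is the health-check route listed under the System endpoints
    response = client.get("/health")
    assert response.status_code == 200
```
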
## Troubleshooting

### Common Issues

1. **API Key Not Configured**
   - Ensure you've added your API key to `backend/.env`
   - Restart the backend server after changing environment variables

2. **Upload Fails**
   - Check file size (max 10MB)
   - Ensure file is a valid PDF
   - Check backend logs for detailed error messages

3. **Chat Not Working**
   - Verify AI service is configured and working
   - Check if documents are properly processed
   - Review browser console for frontend errors

4. **Docker Issues**
   - Ensure Docker and Docker Compose are installed
   - Check if ports 3000 and 8000 are available
   - Use `docker-compose logs` to view service logs

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is unlicensed - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Built with [FastAPI](https://fastapi.tiangolo.com/) and [Next.js](https://nextjs.org/)
- Vector storage powered by [ChromaDB](https://www.trychroma.com/)
- AI capabilities provided by [OpenAI](https://openai.com/) and [Anthropic](https://www.anthropic.com/)
- UI components from [Tailwind CSS](https://tailwindcss.com/) and [Lucide React](https://lucide.dev/)
backend/Dockerfile
ADDED
@@ -0,0 +1,30 @@
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies (curl is needed for the HEALTHCHECK below; slim images do not include it)
RUN apt-get update && apt-get install -y \
    gcc \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better caching
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create necessary directories
RUN mkdir -p uploads chroma_db

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run the application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
backend/app/__init__.py
ADDED
@@ -0,0 +1 @@
# App package
backend/app/api/__init__.py
ADDED
@@ -0,0 +1 @@
# API package
backend/app/api/endpoints/__init__.py
ADDED
@@ -0,0 +1 @@
# Endpoints package
backend/app/api/endpoints/chat.py
ADDED
@@ -0,0 +1,204 @@
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy import func
from sqlalchemy.orm import Session
from typing import List
import json
import uuid
from datetime import datetime

from app.core.database import get_db
from app.models.document import ChatMessage
from app.schemas.chat import ChatRequest, ChatResponse, ChatHistoryResponse, ChatMessageCreate, ChatMessageResponse
from app.services.vector_store import VectorStore
from app.services.ai_service import AIService

router = APIRouter()
vector_store = VectorStore()
ai_service = AIService()


@router.post("/", response_model=ChatResponse)
def chat_with_documents(
    request: ChatRequest,
    db: Session = Depends(get_db)
):
    """Send a question and get an answer based on uploaded documents"""
    try:
        # Check if AI service is configured
        if not ai_service.is_configured():
            raise HTTPException(
                status_code=503,
                detail="AI service not configured. Please set up OpenAI or Anthropic API keys."
            )

        # Search for relevant documents
        print(f"Searching for documents with query: {request.question}")
        context_documents = vector_store.search_similar(request.question, n_results=5, document_id=str(request.document_id) if request.document_id is not None else None)
        print(f"Found {len(context_documents)} relevant documents")
        if context_documents:
            print(f"Document IDs found: {[doc.get('metadata', {}).get('document_id') for doc in context_documents]}")
        else:
            print("No documents found in vector store")
            # Check vector store stats
            stats = vector_store.get_collection_stats()
            print(f"Vector store stats: {stats}")

        if not context_documents:
            # No relevant documents found
            answer = "I don't have enough information in the uploaded documents to answer your question. Please upload relevant PDF documents first."

            # Save user message
            user_message = ChatMessage(
                session_id=request.session_id,
                message_type="user",
                content=request.question
            )
            db.add(user_message)
            db.commit()

            # Save assistant message
            assistant_message = ChatMessage(
                session_id=request.session_id,
                message_type="assistant",
                content=answer
            )
            db.add(assistant_message)
            db.commit()

            return ChatResponse(
                success=True,
                answer=answer,
                session_id=request.session_id,
                message_id=assistant_message.id
            )

        # Generate answer using AI
        ai_response = ai_service.generate_answer(
            request.question,
            context_documents,
            model=request.model
        )

        # Save user message
        user_message = ChatMessage(
            session_id=request.session_id,
            message_type="user",
            content=request.question
        )
        db.add(user_message)
        db.commit()

        # Save assistant message with document references
        document_refs = json.dumps([doc.get('metadata', {}).get('document_id') for doc in context_documents])
        assistant_message = ChatMessage(
            session_id=request.session_id,
            message_type="assistant",
            content=ai_response["answer"],
            document_references=document_refs
        )
        db.add(assistant_message)
        db.commit()

        return ChatResponse(
            success=ai_response["success"],
            answer=ai_response["answer"],
            model=ai_response.get("model"),
            sources=ai_response.get("sources", []),
            session_id=request.session_id,
            message_id=assistant_message.id
        )

    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error processing chat request: {str(e)}")


@router.get("/history/{session_id}", response_model=ChatHistoryResponse)
def get_chat_history(
    session_id: str,
    skip: int = 0,
    limit: int = 50,
    db: Session = Depends(get_db)
):
    """Get chat history for a specific session"""
    try:
        messages = db.query(ChatMessage).filter(
            ChatMessage.session_id == session_id
        ).order_by(ChatMessage.created_at.asc()).offset(skip).limit(limit).all()

        total = db.query(ChatMessage).filter(
            ChatMessage.session_id == session_id
        ).count()

        return ChatHistoryResponse(
            messages=[ChatMessageResponse.from_orm(msg) for msg in messages],
            total=total
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error retrieving chat history: {str(e)}")


@router.post("/session/new")
def create_new_session():
    """Create a new chat session"""
    try:
        session_id = str(uuid.uuid4())
        return {"session_id": session_id}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error creating session: {str(e)}")


@router.get("/sessions")
def list_sessions(db: Session = Depends(get_db)):
    """List all chat sessions"""
    try:
        # Get unique session IDs with message counts
        sessions = db.query(
            ChatMessage.session_id,
            func.count(ChatMessage.id).label('message_count'),
            func.max(ChatMessage.created_at).label('last_message_at')
        ).group_by(ChatMessage.session_id).order_by(
            func.max(ChatMessage.created_at).desc()
        ).all()

        return [
            {
                "session_id": session.session_id,
                "message_count": session.message_count,
                "last_message_at": session.last_message_at
            }
            for session in sessions
        ]
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error retrieving sessions: {str(e)}")


@router.delete("/session/{session_id}")
def delete_session(session_id: str, db: Session = Depends(get_db)):
    """Delete a chat session and all its messages"""
    try:
        messages = db.query(ChatMessage).filter(
            ChatMessage.session_id == session_id
        ).all()

        for message in messages:
            db.delete(message)

        db.commit()

        return {"success": True, "message": f"Session {session_id} deleted successfully"}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error deleting session: {str(e)}")


@router.get("/models/available")
def get_available_models():
    """Get list of available AI models"""
    try:
        models = ai_service.get_available_models()
        return {
            "available_models": models,
            "is_configured": ai_service.is_configured()
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error retrieving models: {str(e)}")
backend/app/api/endpoints/documents.py
ADDED
@@ -0,0 +1,216 @@
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File
from sqlalchemy import func
from sqlalchemy.orm import Session
from typing import List
import os
import uuid
import aiofiles
from datetime import datetime

from app.core.database import get_db
from app.core.config import settings
from app.models.document import Document
from app.schemas.document import DocumentResponse, DocumentListResponse, DocumentDeleteResponse, UploadResponse
from app.services.pdf_processor import PDFProcessor
from app.services.vector_store import VectorStore
from app.models.document import ChatMessage
import shutil

router = APIRouter()
pdf_processor = PDFProcessor()
vector_store = VectorStore()


@router.post("/upload", response_model=UploadResponse)
async def upload_document(
    file: UploadFile = File(...),
    db: Session = Depends(get_db)
):
    """Upload and process a PDF document"""
    try:
        # Restrict to 3 documents max
        doc_count = db.query(Document).count()
        if doc_count >= 3:
            raise HTTPException(status_code=400, detail="You can only upload up to 3 documents.")
        # Validate file type
        if not file.filename.lower().endswith('.pdf'):
            raise HTTPException(status_code=400, detail="Only PDF files are allowed")

        # Generate unique filename
        file_extension = os.path.splitext(file.filename)[1]
        unique_filename = f"{uuid.uuid4()}{file_extension}"
        file_path = os.path.join(settings.UPLOAD_DIR, unique_filename)

        # Save file
        async with aiofiles.open(file_path, 'wb') as f:
            content = await file.read()
            await f.write(content)

        # Process PDF
        success, text_content, metadata = pdf_processor.process_pdf(file_path)

        if not success:
            # Clean up file if processing failed
            if os.path.exists(file_path):
                os.remove(file_path)
            raise HTTPException(status_code=400, detail=text_content)

        # Create document record
        db_document = Document(
            filename=unique_filename,
            original_filename=file.filename,
            file_path=file_path,
            file_size=len(content),
            content=text_content,
            processed=True
        )

        db.add(db_document)
        db.commit()
        db.refresh(db_document)

        # Add to vector store
        print(f"Adding document {db_document.id} to vector store...")
        vector_success = vector_store.add_document(
            str(db_document.id),
            text_content,
            metadata={
                "filename": file.filename,
                "file_size": len(content),
                "num_pages": metadata.get('num_pages', 0)
            }
        )

        if not vector_success:
            # Log warning but don't fail the upload
            print(f"Warning: Failed to add document {db_document.id} to vector store")
        else:
            print(f"Successfully added document {db_document.id} to vector store")
            # Check collection stats
            stats = vector_store.get_collection_stats()
            print(f"Vector store stats: {stats}")

        return UploadResponse(
            success=True,
            document=DocumentResponse.from_orm(db_document),
            message="Document uploaded and processed successfully"
        )

    except HTTPException:
        raise
    except Exception as e:
        # Clean up file if something went wrong
        if 'file_path' in locals() and os.path.exists(file_path):
            os.remove(file_path)
        raise HTTPException(status_code=500, detail=f"Error uploading document: {str(e)}")


@router.get("/", response_model=DocumentListResponse)
def list_documents(
    skip: int = 0,
    limit: int = 100,
    db: Session = Depends(get_db)
):
    """List all uploaded documents"""
    try:
        documents = db.query(Document).offset(skip).limit(limit).all()
        total = db.query(Document).count()

        return DocumentListResponse(
            documents=[DocumentResponse.from_orm(doc) for doc in documents],
            total=total
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error retrieving documents: {str(e)}")


@router.get("/{document_id}", response_model=DocumentResponse)
def get_document(document_id: int, db: Session = Depends(get_db)):
    """Get a specific document by ID"""
    try:
        document = db.query(Document).filter(Document.id == document_id).first()
        if not document:
            raise HTTPException(status_code=404, detail="Document not found")

        return DocumentResponse.from_orm(document)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error retrieving document: {str(e)}")


@router.delete("/{document_id}", response_model=DocumentDeleteResponse)
def delete_document(document_id: int, db: Session = Depends(get_db)):
    """Delete a document and its vector embeddings"""
    try:
        document = db.query(Document).filter(Document.id == document_id).first()
        if not document:
            raise HTTPException(status_code=404, detail="Document not found")

        # Delete from vector store
        vector_store.delete_document(str(document_id))

        # Delete file from filesystem
        if os.path.exists(document.file_path):
            os.remove(document.file_path)

        # Delete from database
        db.delete(document)
        db.commit()

        return DocumentDeleteResponse(
            success=True,
            message=f"Document {document.original_filename} deleted successfully"
        )
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error deleting document: {str(e)}")


@router.post("/clear_all")
def clear_all_data(db: Session = Depends(get_db)):
    """Admin endpoint to clear all documents, chat messages, uploaded files, and vector store."""
    try:
        # Delete all documents and chat messages from DB
        db.query(Document).delete()
        db.query(ChatMessage).delete()
        db.commit()
        # Delete all files in uploads directory
        upload_dir = settings.UPLOAD_DIR
        for filename in os.listdir(upload_dir):
            file_path = os.path.join(upload_dir, filename)
            try:
                if os.path.isfile(file_path) or os.path.islink(file_path):
                    os.unlink(file_path)
                elif os.path.isdir(file_path):
                    shutil.rmtree(file_path)
            except Exception as e:
                print(f"Failed to delete {file_path}: {e}")
        # Clear ChromaDB vector store using the singleton
        vector_store.clear_all()
        return {"success": True, "message": "All documents, chat messages, uploads, and vectors cleared."}
    except Exception as e:
        return {"success": False, "message": f"Error clearing data: {str(e)}"}


@router.get("/stats/summary")
def get_document_stats(db: Session = Depends(get_db)):
    """Get document statistics"""
    try:
        total_documents = db.query(Document).count()
        processed_documents = db.query(Document).filter(Document.processed == True).count()
        total_size = db.query(Document).with_entities(
            func.sum(Document.file_size)
        ).scalar() or 0

        vector_stats = vector_store.get_collection_stats()

        return {
            "total_documents": total_documents,
            "processed_documents": processed_documents,
            "total_size_bytes": total_size,
            "total_size_mb": round(total_size / (1024 * 1024), 2),
            "vector_store_chunks": vector_stats.get("total_documents", 0)
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error retrieving stats: {str(e)}")
backend/app/core/__init__.py
ADDED
@@ -0,0 +1 @@
# Core package
backend/app/core/config.py
ADDED
@@ -0,0 +1,47 @@
from pydantic_settings import BaseSettings
from typing import Optional
import os


class Settings(BaseSettings):
    # API Configuration
    API_V1_STR: str = "/api/v1"
    PROJECT_NAME: str = "PDF Q&A Chatbot"

    # Security
    SECRET_KEY: str = "your-secret-key-here"
    ACCESS_TOKEN_EXPIRE_MINUTES: int = 60 * 24 * 8  # 8 days

    # Database
    DATABASE_URL: str = "sqlite:///./pdf_chatbot.db"

    # Vector Database
    CHROMA_PERSIST_DIRECTORY: str = "./chroma_db"

    # AI Providers
    OPENROUTER_API_KEY: Optional[str] = None
    ANTHROPIC_API_KEY: Optional[str] = None

    # File Storage
    UPLOAD_DIR: str = "./uploads"
    MAX_FILE_SIZE: int = 10 * 1024 * 1024  # 10MB
    ALLOWED_EXTENSIONS: list = [".pdf"]

    # CORS
    BACKEND_CORS_ORIGINS: list = [
        "http://localhost:3000",
        "http://localhost:3001",
        "http://127.0.0.1:3000",
        "http://127.0.0.1:3001",
    ]

    class Config:
        env_file = ".env"
        case_sensitive = True


settings = Settings()

# Ensure upload directory exists
os.makedirs(settings.UPLOAD_DIR, exist_ok=True)
os.makedirs(settings.CHROMA_PERSIST_DIRECTORY, exist_ok=True)
backend/app/core/database.py
ADDED
@@ -0,0 +1,29 @@
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session
from sqlalchemy.ext.declarative import declarative_base
from app.core.config import settings

engine = create_engine(
    settings.DATABASE_URL,
    connect_args={"check_same_thread": False} if "sqlite" in settings.DATABASE_URL else {}
)

SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

Base = declarative_base()

# Import models to ensure they are registered
from app.models.document import Document, ChatMessage


def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


def create_tables():
    """Create all tables in the database"""
    Base.metadata.create_all(bind=engine)
backend/app/models/__init__.py
ADDED
@@ -0,0 +1 @@
# Models package
backend/app/models/document.py
ADDED
@@ -0,0 +1,35 @@
from sqlalchemy import Column, Integer, String, DateTime, Text, Boolean
from sqlalchemy.sql import func
from datetime import datetime
from app.core.database import Base


class Document(Base):
    __tablename__ = "documents"

    id = Column(Integer, primary_key=True, index=True)
    filename = Column(String(255), nullable=False)
    original_filename = Column(String(255), nullable=False)
    file_path = Column(String(500), nullable=False)
    file_size = Column(Integer, nullable=False)
    content = Column(Text, nullable=True)
    processed = Column(Boolean, default=False)
    created_at = Column(DateTime(timezone=True), server_default=func.now())
    updated_at = Column(DateTime(timezone=True), onupdate=func.now())

    def __repr__(self):
        return f"<Document(id={self.id}, filename='{self.filename}')>"


class ChatMessage(Base):
    __tablename__ = "chat_messages"

    id = Column(Integer, primary_key=True, index=True)
    session_id = Column(String(255), nullable=False, index=True)
    message_type = Column(String(20), nullable=False)  # 'user' or 'assistant'
    content = Column(Text, nullable=False)
    document_references = Column(Text, nullable=True)  # JSON string of referenced documents
    created_at = Column(DateTime(timezone=True), server_default=func.now())

    def __repr__(self):
        return f"<ChatMessage(id={self.id}, session_id='{self.session_id}', type='{self.message_type}')>"
backend/app/schemas/__init__.py
ADDED
@@ -0,0 +1 @@
# Schemas package
backend/app/schemas/chat.py
ADDED
@@ -0,0 +1,50 @@
from pydantic import BaseModel
from typing import Optional, List, Dict
from datetime import datetime


class ChatMessageBase(BaseModel):
    content: str
    message_type: str  # 'user' or 'assistant'


class ChatMessageCreate(ChatMessageBase):
    session_id: str
    document_references: Optional[str] = None


class ChatMessageResponse(ChatMessageBase):
    id: int
    session_id: str
    document_references: Optional[str] = None
    created_at: datetime

    class Config:
        from_attributes = True


class ChatRequest(BaseModel):
    question: str
    session_id: str
    model: Optional[str] = "auto"
    document_id: Optional[int] = None


class ChatResponse(BaseModel):
    success: bool
    answer: str
    model: Optional[str] = None
    sources: List[str] = []
    session_id: str
    message_id: Optional[int] = None


class ChatHistoryResponse(BaseModel):
    messages: List[ChatMessageResponse]
    total: int


class ChatSessionResponse(BaseModel):
    session_id: str
    message_count: int
    last_message_at: Optional[datetime] = None
backend/app/schemas/document.py
ADDED
@@ -0,0 +1,40 @@
from pydantic import BaseModel
from typing import Optional, List
from datetime import datetime


class DocumentBase(BaseModel):
    filename: str
    original_filename: str
    file_size: int


class DocumentCreate(DocumentBase):
    file_path: str


class DocumentResponse(DocumentBase):
    id: int
    content: Optional[str] = None
    processed: bool
    created_at: datetime
    updated_at: Optional[datetime] = None

    class Config:
        from_attributes = True


class DocumentListResponse(BaseModel):
    documents: List[DocumentResponse]
    total: int


class DocumentDeleteResponse(BaseModel):
    success: bool
    message: str


class UploadResponse(BaseModel):
    success: bool
    document: Optional[DocumentResponse] = None
    message: str
backend/app/services/__init__.py
ADDED
@@ -0,0 +1 @@
# Services package
backend/app/services/ai_service.py
ADDED
@@ -0,0 +1,267 @@
import openai
import anthropic
from typing import List, Dict, Optional, Union
import logging
from app.core.config import settings
import os
import httpx

logger = logging.getLogger(__name__)


class AIService:
    def __init__(self):
        self.openrouter_api_key = settings.OPENROUTER_API_KEY
        self.anthropic_client = None

        # Initialize OpenRouter (OpenAI-compatible) client
        if self.openrouter_api_key:
            # Validate API key format
            if not self.openrouter_api_key.startswith('sk-or-'):
                logger.warning("OpenRouter API key doesn't start with 'sk-or-'. This might cause issues.")

            openai.api_key = self.openrouter_api_key
            openai.base_url = "https://openrouter.ai/api/v1"
            os.environ["OPENAI_API_KEY"] = self.openrouter_api_key
            os.environ["OPENAI_BASE_URL"] = "https://openrouter.ai/api/v1"
            logger.info("OpenRouter API key configured")
        else:
            logger.warning("No OpenRouter API key found")

        # Initialize Anthropic client
        if settings.ANTHROPIC_API_KEY:
            self.anthropic_client = anthropic.Anthropic(api_key=settings.ANTHROPIC_API_KEY)
            logger.info("Anthropic API key configured")
        else:
            logger.warning("No Anthropic API key found")

    def generate_answer(self, question: str, context_documents: List[Dict], model: str = "auto") -> Dict:
        """Generate answer based on question and context documents"""
        try:
            # Prepare context from documents
            context = self._prepare_context(context_documents)
            # Collect unique document IDs from context_documents
            used_doc_ids = list({str(doc.get('metadata', {}).get('document_id')) for doc in context_documents if doc.get('metadata', {}).get('document_id') is not None})
            # Choose model based on availability
            if model == "auto":
                if self.openrouter_api_key:
                    model = "openrouter"
                elif self.anthropic_client:
                    model = "anthropic"
                else:
                    return {
                        "success": False,
                        "answer": "No AI service configured. Please set up OpenRouter or Anthropic API keys.",
                        "sources": []
                    }
            if model == "openrouter" and self.openrouter_api_key:
                ai_result = self._generate_openrouter_answer(question, context)
            elif model == "anthropic" and self.anthropic_client:
                ai_result = self._generate_anthropic_answer(question, context)
            else:
                return {
                    "success": False,
                    "answer": f"Model {model} not available or not configured.",
                    "sources": []
                }
            # Always use the actual doc IDs from context_documents for sources
            ai_result["sources"] = used_doc_ids
            return ai_result
        except Exception as e:
            logger.error(f"Error generating answer: {e}")
            return {
                "success": False,
                "answer": f"Error generating answer: {str(e)}",
                "sources": []
            }

    def _prepare_context(self, context_documents: List[Dict]) -> str:
        """Prepare context string from document chunks"""
        if not context_documents:
            return ""
        context_parts = []
        for doc in context_documents:
            content = doc.get('content', '')
            metadata = doc.get('metadata', {})
            similarity = doc.get('similarity_score', 0)
            doc_id = metadata.get('document_id', 'unknown')
            chunk_index = metadata.get('chunk_index')
            # Use real document ID as the main label
            doc_info = f"Document {doc_id} (Relevance: {similarity:.2f})"
            if chunk_index is not None:
                doc_info += f" - Chunk: {chunk_index}"
            context_parts.append(f"{doc_info}:\n{content}\n")
        return "\n".join(context_parts)

    def _generate_openrouter_answer(self, question: str, context: str) -> Dict:
        """Generate answer using OpenRouter API via HTTPX"""
        try:
            system_prompt = """You are a helpful AI assistant that answers questions based on provided document context.
Follow these guidelines:
1. Only answer based on the information provided in the context
2. If the context doesn't contain enough information to answer the question, say so
3. Be concise but comprehensive
4. Cite specific parts of the documents when possible
5. If you're unsure about something, acknowledge the uncertainty
6. Format your response clearly and professionally using proper markdown formatting:\n - Use **bold** for important points and headings\n - Use bullet points (•) for lists\n - Use numbered lists for step-by-step instructions\n - Use proper paragraph breaks for readability\n - Structure your response with clear sections when appropriate\n - Use blockquotes for important quotes or key information\n7. You must answer the user's question as directly as possible, using only the information in the context. Do not simply summarize the context. If the context does not contain an answer, say so."""

            user_prompt = f"""Context from documents:\n{context}\n\nQuestion: {question}\n\nPlease provide a comprehensive answer based on the context above."""

            headers = {
                "Authorization": f"Bearer {self.openrouter_api_key}",
                "HTTP-Referer": "http://localhost:3000",
                "X-Title": "PDF Q&A Chatbot",
                "Content-Type": "application/json"
            }
            payload = {
                "model": "meta-llama/llama-3-70b-instruct",
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                "max_tokens": 1000,
                "temperature": 0.3
            }
            url = "https://openrouter.ai/api/v1/chat/completions"

            logger.info(f"Making OpenRouter API request to: {url}")
            logger.info(f"Request payload: {payload}")

            with httpx.Client(timeout=60) as client:
                resp = client.post(url, headers=headers, json=payload)

            logger.info(f"OpenRouter API response status: {resp.status_code}")
            logger.info(f"OpenRouter API response headers: {dict(resp.headers)}")

            # Log the raw response for debugging
            response_text = resp.text
            logger.info(f"OpenRouter API raw response: {response_text[:500]}...")

            if not response_text.strip():
                logger.error("OpenRouter API returned empty response")
                return {
                    "success": False,
                    "answer": "OpenRouter API returned empty response. Please check your API key and try again.",
                    "sources": []
                }

            resp.raise_for_status()

            try:
                data = resp.json()
                logger.info(f"OpenRouter API parsed response: {data}")
            except Exception as e:
                logger.error(f"OpenRouter API non-JSON response: {response_text}")
                logger.error(f"JSON parsing error: {e}")
                return {
                    "success": False,
                    "answer": f"OpenRouter API returned invalid JSON response: {str(e)}",
                    "sources": []
                }

            if "choices" not in data or not data["choices"]:
                logger.error(f"OpenRouter API response missing choices: {data}")
                return {
                    "success": False,
                    "answer": "OpenRouter API response missing choices field",
                    "sources": []
                }

            answer = data["choices"][0]["message"]["content"].strip()

            return {
                "success": True,
                "answer": answer,
                "model": "openrouter/meta-llama/llama-3-70b-instruct",
                "sources": []  # Always set to empty, will be overwritten by generate_answer
            }
        except httpx.HTTPStatusError as e:
            logger.error(f"OpenRouter API HTTP error: {e.response.status_code} - {e.response.text}")
            return {
                "success": False,
                "answer": f"OpenRouter API HTTP error: {e.response.status_code} - {e.response.text}",
                "sources": []
            }
        except Exception as e:
            logger.error(f"OpenRouter API error: {e}")
            return {
                "success": False,
                "answer": f"Error calling OpenRouter API: {str(e)}",
                "sources": []
            }

    def _generate_anthropic_answer(self, question: str, context: str) -> Dict:
        """Generate answer using Anthropic Claude API"""
        try:
            if not self.anthropic_client:
                return {
                    "success": False,
                    "answer": "Anthropic client not configured",
                    "sources": []
                }

            system_prompt = """You are a helpful AI assistant that answers questions based on provided document context.
Follow these guidelines:
1. Only answer based on the information provided in the context
2. If the context doesn't contain enough information to answer the question, say so
3. Be concise but comprehensive
4. Cite specific parts of the documents when possible
5. If you're unsure about something, acknowledge the uncertainty
6. Format your response clearly and professionally using proper markdown formatting:\n - Use **bold** for important points and headings\n - Use bullet points (•) for lists\n - Use numbered lists for step-by-step instructions\n - Use proper paragraph breaks for readability\n - Structure your response with clear sections when appropriate\n - Use blockquotes for important quotes or key information\n7. You must answer the user's question as directly as possible, using only the information in the context. Do not simply summarize the context. If the context does not contain an answer, say so."""

            user_prompt = f"""Context from documents:\n{context}\n\nQuestion: {question}\n\nPlease provide a comprehensive answer based on the context above."""

            response = self.anthropic_client.messages.create(
                model="claude-3-sonnet-20240229",
                max_tokens=1000,
                temperature=0.3,
                system=system_prompt,
                messages=[
                    {"role": "user", "content": user_prompt}
                ]
            )

            answer = response.content[0].text.strip()

            return {
                "success": True,
                "answer": answer,
                "model": "claude-3-sonnet-20240229",
                "sources": []  # Always set to empty, will be overwritten by generate_answer
            }

        except Exception as e:
            logger.error(f"Anthropic API error: {e}")
            return {
                "success": False,
                "answer": f"Error calling Anthropic API: {str(e)}",
                "sources": []
            }

    def _extract_sources_from_context(self, context: str) -> List[str]:
        """Extract source information from context"""
        sources = set()  # Use set to avoid duplicates
        lines = context.split('\n')

        for line in lines:
            if line.startswith('Document') and 'ID:' in line:
                # Extract document ID
                parts = line.split('ID:')
                if len(parts) > 1:
                    doc_id = parts[1].strip().split()[0]
                    sources.add(f"Document ID: {doc_id}")

        return list(sources)  # Convert back to list

    def get_available_models(self) -> List[str]:
        """Get list of available AI models"""
        models = []
        if self.openrouter_api_key:
            models.append("openrouter")
        if self.anthropic_client:
            models.append("anthropic")
        return models

    def is_configured(self) -> bool:
        """Check if any AI service is configured"""
        return bool(self.openrouter_api_key or self.anthropic_client)
backend/app/services/pdf_processor.py
ADDED
@@ -0,0 +1,105 @@
1 |
+
import PyPDF2
|
2 |
+
import os
|
3 |
+
from typing import Optional, Tuple
|
4 |
+
from app.core.config import settings
|
5 |
+
import logging
|
6 |
+
|
7 |
+
logger = logging.getLogger(__name__)
|
8 |
+
|
9 |
+
|
10 |
+
class PDFProcessor:
|
11 |
+
def __init__(self):
|
12 |
+
self.allowed_extensions = settings.ALLOWED_EXTENSIONS
|
13 |
+
self.max_file_size = settings.MAX_FILE_SIZE
|
14 |
+
|
15 |
+
def validate_file(self, file_path: str) -> Tuple[bool, str]:
|
16 |
+
"""Validate uploaded file"""
|
17 |
+
if not os.path.exists(file_path):
|
18 |
+
return False, "File does not exist"
|
19 |
+
|
20 |
+
# Check file size
|
21 |
+
file_size = os.path.getsize(file_path)
|
22 |
+
if file_size > self.max_file_size:
|
23 |
+
return False, f"File size exceeds maximum allowed size of {self.max_file_size} bytes"
|
24 |
+
|
25 |
+
# Check file extension
|
26 |
+
file_ext = os.path.splitext(file_path)[1].lower()
|
27 |
+
if file_ext not in self.allowed_extensions:
|
28 |
+
return False, f"File type not allowed. Allowed types: {', '.join(self.allowed_extensions)}"
|
29 |
+
|
30 |
+
return True, "File is valid"
|
31 |
+
|
32 |
+
def extract_text(self, file_path: str) -> Optional[str]:
|
33 |
+
"""Extract text content from PDF file"""
|
34 |
+
try:
|
35 |
+
with open(file_path, 'rb') as file:
|
36 |
+
pdf_reader = PyPDF2.PdfReader(file)
|
37 |
+
text_content = []
|
38 |
+
|
39 |
+
for page_num in range(len(pdf_reader.pages)):
|
40 |
+
try:
|
41 |
+
page = pdf_reader.pages[page_num]
|
42 |
+
text = page.extract_text()
|
43 |
+
if text.strip():
|
44 |
+
text_content.append(f"Page {page_num + 1}:\n{text.strip()}")
|
45 |
+
except Exception as e:
|
46 |
+
logger.warning(f"Error extracting text from page {page_num + 1}: {e}")
|
47 |
+
continue
|
48 |
+
|
49 |
+
return "\n\n".join(text_content)
|
50 |
+
|
51 |
+
except Exception as e:
|
52 |
+
logger.error(f"Error processing PDF file {file_path}: {e}")
|
53 |
+
return None
|
54 |
+
|
55 |
+
def get_metadata(self, file_path: str) -> dict:
|
56 |
+
"""Extract metadata from PDF file"""
|
57 |
+
try:
|
58 |
+
with open(file_path, 'rb') as file:
|
59 |
+
pdf_reader = PyPDF2.PdfReader(file)
|
60 |
+
metadata = {
|
61 |
+
'num_pages': len(pdf_reader.pages),
|
62 |
+
'file_size': os.path.getsize(file_path),
|
63 |
+
'title': None,
|
64 |
+
'author': None,
|
65 |
+
'subject': None,
|
66 |
+
'creator': None
|
67 |
+
}
|
68 |
+
|
69 |
+
if pdf_reader.metadata:
|
70 |
+
metadata.update({
|
71 |
+
'title': pdf_reader.metadata.get('/Title'),
|
72 |
+
'author': pdf_reader.metadata.get('/Author'),
|
73 |
+
'subject': pdf_reader.metadata.get('/Subject'),
|
74 |
+
'creator': pdf_reader.metadata.get('/Creator')
|
75 |
+
})
|
76 |
+
|
77 |
+
return metadata
|
78 |
+
|
79 |
+
except Exception as e:
|
80 |
+
logger.error(f"Error extracting metadata from PDF file {file_path}: {e}")
|
81 |
+
return {
|
82 |
+
'num_pages': 0,
|
83 |
+
'file_size': os.path.getsize(file_path) if os.path.exists(file_path) else 0,
|
84 |
+
'title': None,
|
85 |
+
'author': None,
|
86 |
+
'subject': None,
|
87 |
+
'creator': None
|
88 |
+
}
|
89 |
+
|
90 |
+
def process_pdf(self, file_path: str) -> Tuple[bool, str, dict]:
|
91 |
+
"""Process PDF file and return text content and metadata"""
|
92 |
+
# Validate file
|
93 |
+
is_valid, error_message = self.validate_file(file_path)
|
94 |
+
if not is_valid:
|
95 |
+
return False, error_message, {}
|
96 |
+
|
97 |
+
# Extract text
|
98 |
+
text_content = self.extract_text(file_path)
|
99 |
+
if text_content is None:
|
100 |
+
return False, "Failed to extract text from PDF", {}
|
101 |
+
|
102 |
+
# Get metadata
|
103 |
+
metadata = self.get_metadata(file_path)
|
104 |
+
|
105 |
+
return True, text_content, metadata
|
backend/app/services/vector_store.py
ADDED
@@ -0,0 +1,194 @@
1 |
+
import chromadb
|
2 |
+
from chromadb.config import Settings as ChromaSettings
|
3 |
+
from typing import List, Dict, Optional, Tuple
|
4 |
+
import json
|
5 |
+
import logging
|
6 |
+
from app.core.config import settings
|
7 |
+
|
8 |
+
logger = logging.getLogger(__name__)
|
9 |
+
|
10 |
+
|
11 |
+
class VectorStore:
|
12 |
+
_instance = None
|
13 |
+
|
14 |
+
def __new__(cls):
|
15 |
+
if cls._instance is None:
|
16 |
+
cls._instance = super(VectorStore, cls).__new__(cls)
|
17 |
+
cls._instance._initialized = False
|
18 |
+
return cls._instance
|
19 |
+
|
20 |
+
def __init__(self):
|
21 |
+
if not self._initialized:
|
22 |
+
self.client = chromadb.PersistentClient(
|
23 |
+
path=settings.CHROMA_PERSIST_DIRECTORY,
|
24 |
+
settings=ChromaSettings(
|
25 |
+
anonymized_telemetry=False
|
26 |
+
)
|
27 |
+
)
|
28 |
+
self.collection_name = "pdf_documents"
|
29 |
+
self.collection = self._get_or_create_collection()
|
30 |
+
self._initialized = True
|
31 |
+
|
32 |
+
def _get_or_create_collection(self):
|
33 |
+
"""Get existing collection or create new one"""
|
34 |
+
try:
|
35 |
+
collection = self.client.get_collection(name=self.collection_name)
|
36 |
+
logger.info(f"Using existing collection: {self.collection_name}")
|
37 |
+
except Exception:
|
38 |
+
collection = self.client.create_collection(
|
39 |
+
name=self.collection_name,
|
40 |
+
metadata={"description": "PDF document embeddings for Q&A chatbot"}
|
41 |
+
)
|
42 |
+
logger.info(f"Created new collection: {self.collection_name}")
|
43 |
+
|
44 |
+
return collection
|
45 |
+
|
46 |
+
def add_document(self, document_id: str, content: str, metadata: Dict = None) -> bool:
|
47 |
+
"""Add document content to vector store"""
|
48 |
+
try:
|
49 |
+
logger.info(f"Starting to add document {document_id} to vector store")
|
50 |
+
logger.info(f"Content length: {len(content)} characters")
|
51 |
+
|
52 |
+
# Split content into chunks for better retrieval
|
53 |
+
chunks = self._split_text(content, chunk_size=1000, overlap=200)
|
54 |
+
logger.info(f"Split content into {len(chunks)} chunks")
|
55 |
+
|
56 |
+
# Prepare data for ChromaDB
|
57 |
+
ids = [f"{document_id}_chunk_{i}" for i in range(len(chunks))]
|
58 |
+
documents = chunks
|
59 |
+
metadatas = [{
|
60 |
+
"document_id": document_id,
|
61 |
+
"chunk_index": i,
|
62 |
+
**(metadata or {})
|
63 |
+
} for i in range(len(chunks))]
|
64 |
+
|
65 |
+
logger.info(f"Prepared {len(ids)} chunks with IDs: {ids[:3]}...") # Log first 3 IDs
|
66 |
+
|
67 |
+
# Add to collection
|
68 |
+
logger.info(f"Adding chunks to ChromaDB collection: {self.collection_name}")
|
69 |
+
self.collection.add(
|
70 |
+
ids=ids,
|
71 |
+
documents=documents,
|
72 |
+
metadatas=metadatas
|
73 |
+
)
|
74 |
+
|
75 |
+
logger.info(f"Successfully added document {document_id} with {len(chunks)} chunks to vector store")
|
76 |
+
return True
|
77 |
+
|
78 |
+
except Exception as e:
|
79 |
+
logger.error(f"Error adding document {document_id} to vector store: {e}")
|
80 |
+
logger.error(f"Exception type: {type(e).__name__}")
|
81 |
+
import traceback
|
82 |
+
logger.error(f"Full traceback: {traceback.format_exc()}")
|
83 |
+
return False
|
84 |
+
|
85 |
+
def search_similar(self, query: str, n_results: int = 5, document_id: str = None) -> List[Dict]:
|
86 |
+
"""Search for similar documents based on query, optionally filtering by document_id"""
|
87 |
+
try:
|
88 |
+
results = self.collection.query(
|
89 |
+
query_texts=[query],
|
90 |
+
n_results=n_results,
|
91 |
+
include=["documents", "metadatas", "distances"]
|
92 |
+
)
|
93 |
+
|
94 |
+
# Format results
|
95 |
+
formatted_results = []
|
96 |
+
if results['documents'] and results['documents'][0]:
|
97 |
+
for i, (doc, metadata, distance) in enumerate(zip(
|
98 |
+
results['documents'][0],
|
99 |
+
results['metadatas'][0],
|
100 |
+
results['distances'][0]
|
101 |
+
)):
|
102 |
+
if document_id is not None and str(metadata.get('document_id')) != str(document_id):
|
103 |
+
continue
|
104 |
+
formatted_results.append({
|
105 |
+
'content': doc,
|
106 |
+
'metadata': metadata,
|
107 |
+
'similarity_score': 1 - distance, # Convert distance to similarity
|
108 |
+
'rank': i + 1
|
109 |
+
})
|
110 |
+
return formatted_results
|
111 |
+
except Exception as e:
|
112 |
+
logger.error(f"Error searching vector store: {e}")
|
113 |
+
return []
|
114 |
+
|
115 |
+
def delete_document(self, document_id: str) -> bool:
|
116 |
+
"""Delete all chunks for a specific document"""
|
117 |
+
try:
|
118 |
+
# Get all chunks for this document
|
119 |
+
results = self.collection.get(
|
120 |
+
where={"document_id": document_id}
|
121 |
+
)
|
122 |
+
|
123 |
+
if results['ids']:
|
124 |
+
self.collection.delete(ids=results['ids'])
|
125 |
+
logger.info(f"Deleted {len(results['ids'])} chunks for document {document_id}")
|
126 |
+
|
127 |
+
return True
|
128 |
+
|
129 |
+
except Exception as e:
|
130 |
+
logger.error(f"Error deleting document {document_id} from vector store: {e}")
|
131 |
+
return False
|
132 |
+
|
133 |
+
def get_collection_stats(self) -> Dict:
|
134 |
+
"""Get statistics about the vector store collection"""
|
135 |
+
try:
|
136 |
+
logger.info(f"Getting stats for collection: {self.collection_name}")
|
137 |
+
count = self.collection.count()
|
138 |
+
logger.info(f"Collection count: {count}")
|
139 |
+
return {
|
140 |
+
"total_documents": count,
|
141 |
+
"collection_name": self.collection_name
|
142 |
+
}
|
143 |
+
except Exception as e:
|
144 |
+
logger.error(f"Error getting collection stats: {e}")
|
145 |
+
logger.error(f"Exception type: {type(e).__name__}")
|
146 |
+
import traceback
|
147 |
+
logger.error(f"Full traceback: {traceback.format_exc()}")
|
148 |
+
return {"total_documents": 0, "collection_name": self.collection_name}
|
149 |
+
|
150 |
+
def _split_text(self, text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
|
151 |
+
"""Split text into overlapping chunks"""
|
152 |
+
if len(text) <= chunk_size:
|
153 |
+
return [text]
|
154 |
+
|
155 |
+
chunks = []
|
156 |
+
start = 0
|
157 |
+
|
158 |
+
while start < len(text):
|
159 |
+
end = start + chunk_size
|
160 |
+
|
161 |
+
# If this isn't the last chunk, try to break at a sentence boundary
|
162 |
+
if end < len(text):
|
163 |
+
# Look for sentence endings
|
164 |
+
for i in range(end, max(start + chunk_size - 100, start), -1):
|
165 |
+
if text[i] in '.!?':
|
166 |
+
end = i + 1
|
167 |
+
break
|
168 |
+
|
169 |
+
chunk = text[start:end].strip()
|
170 |
+
if chunk:
|
171 |
+
chunks.append(chunk)
|
172 |
+
|
173 |
+
# Move start position with overlap
|
174 |
+
start = end - overlap
|
175 |
+
if start >= len(text):
|
176 |
+
break
|
177 |
+
|
178 |
+
return chunks
|
179 |
+
|
180 |
+
def clear_all(self) -> bool:
|
181 |
+
"""Clear all documents from the vector store"""
|
182 |
+
try:
|
183 |
+
self.client.delete_collection(name=self.collection_name)
|
184 |
+
self.collection = self._get_or_create_collection()
|
185 |
+
logger.info("Cleared all documents from vector store")
|
186 |
+
return True
|
187 |
+
except Exception as e:
|
188 |
+
logger.error(f"Error clearing vector store: {e}")
|
189 |
+
return False
|
190 |
+
|
191 |
+
@classmethod
|
192 |
+
def reset_instance(cls):
|
193 |
+
"""Reset the singleton instance - useful after clearing collections"""
|
194 |
+
cls._instance = None
|
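A sketch of the add/search round trip with placeholder IDs and query text. Note that the document_id filter runs after the n_results cut, so a query against a collection shared by several documents may return fewer than n_results hits.

from app.services.vector_store import VectorStore

store = VectorStore()  # singleton: later constructions return the same instance
store.add_document("42", "Long extracted PDF text...", metadata={"filename": "example.pdf"})

for hit in store.search_similar("What is this document about?", n_results=3, document_id="42"):
    print(round(hit["similarity_score"], 3), hit["content"][:80])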
backend/main.py
ADDED
@@ -0,0 +1,125 @@
1 |
+
from fastapi import FastAPI
|
2 |
+
from fastapi.middleware.cors import CORSMiddleware
|
3 |
+
from fastapi.staticfiles import StaticFiles
|
4 |
+
import logging
|
5 |
+
import os
|
6 |
+
|
7 |
+
from app.core.config import settings
|
8 |
+
from app.core.database import create_tables, SessionLocal
|
9 |
+
from app.models.document import Document, ChatMessage
|
10 |
+
import shutil
|
11 |
+
from app.api.endpoints import documents, chat
|
12 |
+
from app.services.vector_store import VectorStore
|
13 |
+
|
14 |
+
# Configure logging
|
15 |
+
logging.basicConfig(
|
16 |
+
level=logging.INFO,
|
17 |
+
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
|
18 |
+
)
|
19 |
+
|
20 |
+
# Create FastAPI app
|
21 |
+
app = FastAPI(
|
22 |
+
title=settings.PROJECT_NAME,
|
23 |
+
description="A comprehensive PDF-based Q&A chatbot system",
|
24 |
+
version="1.0.0",
|
25 |
+
docs_url="/docs",
|
26 |
+
redoc_url="/redoc"
|
27 |
+
)
|
28 |
+
|
29 |
+
# Add CORS middleware
|
30 |
+
app.add_middleware(
|
31 |
+
CORSMiddleware,
|
32 |
+
allow_origins=settings.BACKEND_CORS_ORIGINS,
|
33 |
+
allow_credentials=True,
|
34 |
+
allow_methods=["*"],
|
35 |
+
allow_headers=["*"],
|
36 |
+
)
|
37 |
+
|
38 |
+
# Include API routes
|
39 |
+
app.include_router(
|
40 |
+
documents.router,
|
41 |
+
prefix=f"{settings.API_V1_STR}/documents",
|
42 |
+
tags=["documents"]
|
43 |
+
)
|
44 |
+
|
45 |
+
app.include_router(
|
46 |
+
chat.router,
|
47 |
+
prefix=f"{settings.API_V1_STR}/chat",
|
48 |
+
tags=["chat"]
|
49 |
+
)
|
50 |
+
|
51 |
+
# Health check endpoint
|
52 |
+
@app.get("/health")
|
53 |
+
def health_check():
|
54 |
+
"""Health check endpoint"""
|
55 |
+
return {
|
56 |
+
"status": "healthy",
|
57 |
+
"service": settings.PROJECT_NAME,
|
58 |
+
"version": "1.0.0"
|
59 |
+
}
|
60 |
+
|
61 |
+
# Root endpoint
|
62 |
+
@app.get("/")
|
63 |
+
def root():
|
64 |
+
"""Root endpoint with API information"""
|
65 |
+
return {
|
66 |
+
"message": "PDF Q&A Chatbot API",
|
67 |
+
"version": "1.0.0",
|
68 |
+
"docs": "/docs",
|
69 |
+
"health": "/health"
|
70 |
+
}
|
71 |
+
|
72 |
+
# Startup event
|
73 |
+
@app.on_event("startup")
|
74 |
+
async def startup_event():
|
75 |
+
"""Initialize application on startup"""
|
76 |
+
# Create database tables
|
77 |
+
create_tables()
|
78 |
+
|
79 |
+
# Ensure directories exist
|
80 |
+
os.makedirs(settings.UPLOAD_DIR, exist_ok=True)
|
81 |
+
os.makedirs(settings.CHROMA_PERSIST_DIRECTORY, exist_ok=True)
|
82 |
+
|
83 |
+
# --- ERASE ALL DOCUMENTS, CHAT MESSAGES, AND VECTORS ON STARTUP ---
|
84 |
+
# 1. Delete all rows from documents and chat_messages tables
|
85 |
+
db = SessionLocal()
|
86 |
+
try:
|
87 |
+
db.query(Document).delete()
|
88 |
+
db.query(ChatMessage).delete()
|
89 |
+
db.commit()
|
90 |
+
finally:
|
91 |
+
db.close()
|
92 |
+
# 2. Remove all files in chroma_db directory (but keep the directory)
|
93 |
+
chroma_dir = settings.CHROMA_PERSIST_DIRECTORY
|
94 |
+
for filename in os.listdir(chroma_dir):
|
95 |
+
file_path = os.path.join(chroma_dir, filename)
|
96 |
+
try:
|
97 |
+
if os.path.isfile(file_path) or os.path.islink(file_path):
|
98 |
+
os.unlink(file_path)
|
99 |
+
elif os.path.isdir(file_path):
|
100 |
+
shutil.rmtree(file_path)
|
101 |
+
except Exception as e:
|
102 |
+
logging.warning(f"Failed to delete {file_path}: {e}")
|
103 |
+
# 3. Explicitly clear ChromaDB vector store
|
104 |
+
vector_store = VectorStore()
|
105 |
+
vector_store.clear_all()
|
106 |
+
logging.info("All documents, chat messages, and vector store erased on startup.")
|
107 |
+
# --- END ERASE ---
|
108 |
+
|
109 |
+
logging.info("Application started successfully")
|
110 |
+
|
111 |
+
# Shutdown event
|
112 |
+
@app.on_event("shutdown")
|
113 |
+
async def shutdown_event():
|
114 |
+
"""Cleanup on application shutdown"""
|
115 |
+
logging.info("Application shutting down")
|
116 |
+
|
117 |
+
if __name__ == "__main__":
|
118 |
+
import uvicorn
|
119 |
+
uvicorn.run(
|
120 |
+
"main:app",
|
121 |
+
host="0.0.0.0",
|
122 |
+
port=8000,
|
123 |
+
reload=True,
|
124 |
+
log_level="info"
|
125 |
+
)
|
backend/requirements.txt
ADDED
@@ -0,0 +1,19 @@
1 |
+
fastapi==0.104.1
|
2 |
+
uvicorn[standard]==0.24.0
|
3 |
+
python-multipart==0.0.6
|
4 |
+
pydantic==2.5.0
|
5 |
+
pydantic-settings==2.1.0
|
6 |
+
sqlalchemy==2.0.23
|
7 |
+
alembic==1.13.0
|
8 |
+
chromadb==0.4.18
|
9 |
+
openai==1.3.7
|
10 |
+
anthropic==0.7.8
|
11 |
+
pypdf2==3.0.1
|
12 |
+
python-dotenv==1.0.0
|
13 |
+
aiofiles==23.2.1
|
14 |
+
python-jose[cryptography]==3.3.0
|
15 |
+
passlib[bcrypt]==1.7.4
|
16 |
+
python-multipart==0.0.6
|
17 |
+
httpx==0.25.2
|
18 |
+
pytest==7.4.3
|
19 |
+
pytest-asyncio==0.21.1
|
backend/test_openrouter.py
ADDED
@@ -0,0 +1,93 @@
1 |
+
#!/usr/bin/env python3
|
2 |
+
"""
|
3 |
+
Test script to verify OpenRouter API connectivity
|
4 |
+
"""
|
5 |
+
|
6 |
+
import os
|
7 |
+
import httpx
|
8 |
+
import json
|
9 |
+
from dotenv import load_dotenv
|
10 |
+
|
11 |
+
# Load environment variables
|
12 |
+
load_dotenv()
|
13 |
+
|
14 |
+
def test_openrouter_api():
|
15 |
+
"""Test OpenRouter API connection"""
|
16 |
+
api_key = os.getenv("OPENROUTER_API_KEY")
|
17 |
+
|
18 |
+
if not api_key:
|
19 |
+
print("β No OpenRouter API key found in environment variables")
|
20 |
+
return False
|
21 |
+
|
22 |
+
print(f"π API Key found: {api_key[:10]}...{api_key[-4:]}")
|
23 |
+
|
24 |
+
# Test API endpoint
|
25 |
+
url = "https://openrouter.ai/api/v1/chat/completions"
|
26 |
+
|
27 |
+
headers = {
|
28 |
+
"Authorization": f"Bearer {api_key}",
|
29 |
+
"HTTP-Referer": "http://localhost:3000",
|
30 |
+
"X-Title": "PDF Q&A Chatbot Test",
|
31 |
+
"Content-Type": "application/json"
|
32 |
+
}
|
33 |
+
|
34 |
+
payload = {
|
35 |
+
"model": "meta-llama/llama-3-70b-instruct",
|
36 |
+
"messages": [
|
37 |
+
{"role": "user", "content": "Hello! Please respond with 'API test successful' if you can see this message."}
|
38 |
+
],
|
39 |
+
"max_tokens": 50,
|
40 |
+
"temperature": 0.1
|
41 |
+
}
|
42 |
+
|
43 |
+
print(f"π Making request to: {url}")
|
44 |
+
print(f"π€ Request payload: {json.dumps(payload, indent=2)}")
|
45 |
+
|
46 |
+
try:
|
47 |
+
with httpx.Client(timeout=30) as client:
|
48 |
+
response = client.post(url, headers=headers, json=payload)
|
49 |
+
|
50 |
+
print(f"π₯ Response status: {response.status_code}")
|
51 |
+
print(f"π₯ Response headers: {dict(response.headers)}")
|
52 |
+
|
53 |
+
# Log raw response
|
54 |
+
response_text = response.text
|
55 |
+
print(f"π₯ Raw response: {response_text}")
|
56 |
+
|
57 |
+
if not response_text.strip():
|
58 |
+
print("β Empty response received")
|
59 |
+
return False
|
60 |
+
|
61 |
+
if response.status_code != 200:
|
62 |
+
print(f"β HTTP error: {response.status_code}")
|
63 |
+
return False
|
64 |
+
|
65 |
+
# Try to parse JSON
|
66 |
+
try:
|
67 |
+
data = response.json()
|
68 |
+
print(f"β
JSON parsed successfully: {json.dumps(data, indent=2)}")
|
69 |
+
|
70 |
+
if "choices" in data and data["choices"]:
|
71 |
+
answer = data["choices"][0]["message"]["content"]
|
72 |
+
print(f"π€ AI Response: {answer}")
|
73 |
+
return True
|
74 |
+
else:
|
75 |
+
print("β No choices in response")
|
76 |
+
return False
|
77 |
+
|
78 |
+
except json.JSONDecodeError as e:
|
79 |
+
print(f"β JSON parsing failed: {e}")
|
80 |
+
return False
|
81 |
+
|
82 |
+
except Exception as e:
|
83 |
+
print(f"β Request failed: {e}")
|
84 |
+
return False
|
85 |
+
|
86 |
+
if __name__ == "__main__":
|
87 |
+
print("π§ͺ Testing OpenRouter API connectivity...")
|
88 |
+
success = test_openrouter_api()
|
89 |
+
|
90 |
+
if success:
|
91 |
+
print("β
OpenRouter API test successful!")
|
92 |
+
else:
|
93 |
+
print("β OpenRouter API test failed!")
|
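An alternative sketch, not used by the test above: the pinned openai client can reach the same OpenRouter endpoint by overriding base_url; the model name and key handling mirror the raw httpx request.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("OPENROUTER_API_KEY"),
    base_url="https://openrouter.ai/api/v1",
)
completion = client.chat.completions.create(
    model="meta-llama/llama-3-70b-instruct",
    messages=[{"role": "user", "content": "Hello! Reply with 'API test successful'."}],
    max_tokens=50,
)
print(completion.choices[0].message.content)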
docker-compose.yml
ADDED
@@ -0,0 +1,42 @@
1 |
+
version: '3.8'
|
2 |
+
|
3 |
+
services:
|
4 |
+
backend:
|
5 |
+
build:
|
6 |
+
context: ./backend
|
7 |
+
dockerfile: Dockerfile
|
8 |
+
ports:
|
9 |
+
- "8000:8000"
|
10 |
+
environment:
|
11 |
+
- DATABASE_URL=sqlite:///./pdf_chatbot.db
|
12 |
+
- CHROMA_PERSIST_DIRECTORY=./chroma_db
|
13 |
+
- UPLOAD_DIR=./uploads
|
14 |
+
- MAX_FILE_SIZE=10485760
|
15 |
+
- ALLOWED_EXTENSIONS=[".pdf"]
|
16 |
+
- BACKEND_CORS_ORIGINS=["http://localhost:3000","http://localhost:3001","http://127.0.0.1:3000","http://127.0.0.1:3001"]
|
17 |
+
env_file:
|
18 |
+
- ./backend/.env
|
19 |
+
volumes:
|
20 |
+
- ./backend/uploads:/app/uploads
|
21 |
+
- ./backend/chroma_db:/app/chroma_db
|
22 |
+
- ./backend/pdf_chatbot.db:/app/pdf_chatbot.db
|
23 |
+
restart: unless-stopped
|
24 |
+
|
25 |
+
frontend:
|
26 |
+
build:
|
27 |
+
context: ./frontend
|
28 |
+
dockerfile: Dockerfile
|
29 |
+
ports:
|
30 |
+
- "3000:3000"
|
31 |
+
environment:
|
32 |
+
- NEXT_PUBLIC_API_URL=http://localhost:8000
|
33 |
+
env_file:
|
34 |
+
- ./frontend/.env
|
35 |
+
depends_on:
|
36 |
+
- backend
|
37 |
+
restart: unless-stopped
|
38 |
+
|
39 |
+
volumes:
|
40 |
+
uploads:
|
41 |
+
chroma_db:
|
42 |
+
database:
|
frontend/Dockerfile
ADDED
@@ -0,0 +1,25 @@
1 |
+
FROM node:18-alpine
|
2 |
+
|
3 |
+
WORKDIR /app
|
4 |
+
|
5 |
+
# Copy package files
|
6 |
+
COPY package*.json ./
|
7 |
+
|
8 |
+
# Install dependencies
|
9 |
+
RUN npm ci --only=production
|
10 |
+
|
11 |
+
# Copy application code
|
12 |
+
COPY . .
|
13 |
+
|
14 |
+
# Build the application
|
15 |
+
RUN npm run build
|
16 |
+
|
17 |
+
# Expose port
|
18 |
+
EXPOSE 3000
|
19 |
+
|
20 |
+
# Health check
|
21 |
+
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
|
22 |
+
CMD curl -f http://localhost:3000 || exit 1
|
23 |
+
|
24 |
+
# Run the application
|
25 |
+
CMD ["npm", "start"]
|
frontend/app/globals.css
ADDED
@@ -0,0 +1,107 @@
1 |
+
@tailwind base;
|
2 |
+
@tailwind components;
|
3 |
+
@tailwind utilities;
|
4 |
+
|
5 |
+
@layer base {
|
6 |
+
:root {
|
7 |
+
--background: 0 0% 100%;
|
8 |
+
--foreground: 222.2 84% 4.9%;
|
9 |
+
--card: 0 0% 100%;
|
10 |
+
--card-foreground: 222.2 84% 4.9%;
|
11 |
+
--popover: 0 0% 100%;
|
12 |
+
--popover-foreground: 222.2 84% 4.9%;
|
13 |
+
--primary: 221.2 83.2% 53.3%;
|
14 |
+
--primary-foreground: 210 40% 98%;
|
15 |
+
--secondary: 210 40% 96%;
|
16 |
+
--secondary-foreground: 222.2 84% 4.9%;
|
17 |
+
--muted: 210 40% 96%;
|
18 |
+
--muted-foreground: 215.4 16.3% 46.9%;
|
19 |
+
--accent: 210 40% 96%;
|
20 |
+
--accent-foreground: 222.2 84% 4.9%;
|
21 |
+
--destructive: 0 84.2% 60.2%;
|
22 |
+
--destructive-foreground: 210 40% 98%;
|
23 |
+
--border: 214.3 31.8% 91.4%;
|
24 |
+
--input: 214.3 31.8% 91.4%;
|
25 |
+
--ring: 221.2 83.2% 53.3%;
|
26 |
+
--radius: 0.5rem;
|
27 |
+
}
|
28 |
+
|
29 |
+
.dark {
|
30 |
+
--background: 222.2 84% 4.9%;
|
31 |
+
--foreground: 210 40% 98%;
|
32 |
+
--card: 222.2 84% 4.9%;
|
33 |
+
--card-foreground: 210 40% 98%;
|
34 |
+
--popover: 222.2 84% 4.9%;
|
35 |
+
--popover-foreground: 210 40% 98%;
|
36 |
+
--primary: 217.2 91.2% 59.8%;
|
37 |
+
--primary-foreground: 222.2 84% 4.9%;
|
38 |
+
--secondary: 217.2 32.6% 17.5%;
|
39 |
+
--secondary-foreground: 210 40% 98%;
|
40 |
+
--muted: 217.2 32.6% 17.5%;
|
41 |
+
--muted-foreground: 215 20.2% 65.1%;
|
42 |
+
--accent: 217.2 32.6% 17.5%;
|
43 |
+
--accent-foreground: 210 40% 98%;
|
44 |
+
--destructive: 0 62.8% 30.6%;
|
45 |
+
--destructive-foreground: 210 40% 98%;
|
46 |
+
--border: 217.2 32.6% 17.5%;
|
47 |
+
--input: 217.2 32.6% 17.5%;
|
48 |
+
--ring: 224.3 76.3% 94.1%;
|
49 |
+
}
|
50 |
+
}
|
51 |
+
|
52 |
+
@layer base {
|
53 |
+
* {
|
54 |
+
@apply border-border;
|
55 |
+
}
|
56 |
+
body {
|
57 |
+
@apply bg-background text-foreground;
|
58 |
+
}
|
59 |
+
}
|
60 |
+
|
61 |
+
/* Custom scrollbar */
|
62 |
+
::-webkit-scrollbar {
|
63 |
+
width: 6px;
|
64 |
+
}
|
65 |
+
|
66 |
+
::-webkit-scrollbar-track {
|
67 |
+
background: hsl(var(--muted));
|
68 |
+
}
|
69 |
+
|
70 |
+
::-webkit-scrollbar-thumb {
|
71 |
+
background: hsl(var(--muted-foreground));
|
72 |
+
border-radius: 3px;
|
73 |
+
}
|
74 |
+
|
75 |
+
::-webkit-scrollbar-thumb:hover {
|
76 |
+
background: hsl(var(--foreground));
|
77 |
+
}
|
78 |
+
|
79 |
+
/* Chat message animations */
|
80 |
+
@keyframes slideIn {
|
81 |
+
from {
|
82 |
+
opacity: 0;
|
83 |
+
transform: translateY(10px);
|
84 |
+
}
|
85 |
+
to {
|
86 |
+
opacity: 1;
|
87 |
+
transform: translateY(0);
|
88 |
+
}
|
89 |
+
}
|
90 |
+
|
91 |
+
.chat-message {
|
92 |
+
animation: slideIn 0.3s ease-out;
|
93 |
+
}
|
94 |
+
|
95 |
+
/* Loading animation */
|
96 |
+
@keyframes pulse {
|
97 |
+
0%, 100% {
|
98 |
+
opacity: 1;
|
99 |
+
}
|
100 |
+
50% {
|
101 |
+
opacity: 0.5;
|
102 |
+
}
|
103 |
+
}
|
104 |
+
|
105 |
+
.animate-pulse {
|
106 |
+
animation: pulse 2s cubic-bezier(0.4, 0, 0.6, 1) infinite;
|
107 |
+
}
|
frontend/app/layout.tsx
ADDED
@@ -0,0 +1,26 @@
1 |
+
import type { Metadata } from 'next'
|
2 |
+
import { Inter } from 'next/font/google'
|
3 |
+
import './globals.css'
|
4 |
+
|
5 |
+
const inter = Inter({ subsets: ['latin'] })
|
6 |
+
|
7 |
+
export const metadata: Metadata = {
|
8 |
+
title: 'PDF Q&A Chatbot',
|
9 |
+
description: 'A comprehensive PDF-based Q&A chatbot system',
|
10 |
+
}
|
11 |
+
|
12 |
+
export default function RootLayout({
|
13 |
+
children,
|
14 |
+
}: {
|
15 |
+
children: React.ReactNode
|
16 |
+
}) {
|
17 |
+
return (
|
18 |
+
<html lang="en">
|
19 |
+
<body className={inter.className}>
|
20 |
+
<div className="min-h-screen bg-background">
|
21 |
+
{children}
|
22 |
+
</div>
|
23 |
+
</body>
|
24 |
+
</html>
|
25 |
+
)
|
26 |
+
}
|
frontend/app/page.tsx
ADDED
@@ -0,0 +1,128 @@
1 |
+
'use client'
|
2 |
+
|
3 |
+
import { useState, useEffect, useCallback } from 'react'
|
4 |
+
import { Upload, MessageCircle, FileText, Settings, Send, Trash2 } from 'lucide-react'
|
5 |
+
import ChatInterface from '@/components/ChatInterface'
|
6 |
+
import DocumentUpload from '@/components/DocumentUpload'
|
7 |
+
import DocumentList from '@/components/DocumentList'
|
8 |
+
import { useChatStore } from '@/lib/store'
|
9 |
+
import { apiService } from '@/lib/api'
|
10 |
+
|
11 |
+
export default function Home() {
|
12 |
+
const [activeTab, setActiveTab] = useState<'chat' | 'documents'>('documents')
|
13 |
+
const { sessionId, createNewSession } = useChatStore()
|
14 |
+
const [documentCount, setDocumentCount] = useState(0)
|
15 |
+
|
16 |
+
const fetchDocumentCount = useCallback(async () => {
|
17 |
+
try {
|
18 |
+
const response = await apiService.getDocuments()
|
19 |
+
setDocumentCount(response.documents.length)
|
20 |
+
} catch (e) {
|
21 |
+
setDocumentCount(0)
|
22 |
+
}
|
23 |
+
}, [])
|
24 |
+
|
25 |
+
useEffect(() => {
|
26 |
+
if (!sessionId) {
|
27 |
+
createNewSession()
|
28 |
+
}
|
29 |
+
}, [sessionId, createNewSession])
|
30 |
+
|
31 |
+
useEffect(() => {
|
32 |
+
fetchDocumentCount()
|
33 |
+
}, [activeTab, fetchDocumentCount])
|
34 |
+
|
35 |
+
useEffect(() => {
|
36 |
+
// Clear all data on every page load
|
37 |
+
fetch('/api/v1/documents/clear_all', { method: 'POST' })
|
38 |
+
.then(() => fetchDocumentCount())
|
39 |
+
.catch(() => fetchDocumentCount())
|
40 |
+
}, [])
|
41 |
+
|
42 |
+
return (
|
43 |
+
<div className="flex h-screen bg-gray-50">
|
44 |
+
{/* Sidebar */}
|
45 |
+
<div className="w-64 bg-white border-r border-gray-200 flex flex-col">
|
46 |
+
<div className="p-6 border-b border-gray-200">
|
47 |
+
<h1 className="text-xl font-bold text-gray-900">PDF Q&A Chatbot</h1>
|
48 |
+
<p className="text-sm text-gray-600 mt-1">Upload and chat with your documents</p>
|
49 |
+
</div>
|
50 |
+
|
51 |
+
<nav className="flex-1 p-4">
|
52 |
+
<div className="space-y-2">
|
53 |
+
<button
|
54 |
+
onClick={() => setActiveTab('documents')}
|
55 |
+
className={`w-full flex items-center px-3 py-2 text-sm font-medium rounded-md transition-colors ${
|
56 |
+
activeTab === 'documents'
|
57 |
+
? 'bg-blue-100 text-blue-700'
|
58 |
+
: 'text-gray-600 hover:bg-gray-100'
|
59 |
+
}`}
|
60 |
+
>
|
61 |
+
<FileText className="w-4 h-4 mr-3" />
|
62 |
+
Documents
|
63 |
+
</button>
|
64 |
+
<button
|
65 |
+
onClick={() => setActiveTab('chat')}
|
66 |
+
className={`w-full flex items-center px-3 py-2 text-sm font-medium rounded-md transition-colors ${
|
67 |
+
activeTab === 'chat'
|
68 |
+
? 'bg-blue-100 text-blue-700'
|
69 |
+
: 'text-gray-600 hover:bg-gray-100'
|
70 |
+
}`}
|
71 |
+
>
|
72 |
+
<MessageCircle className="w-4 h-4 mr-3" />
|
73 |
+
Chat
|
74 |
+
</button>
|
75 |
+
</div>
|
76 |
+
</nav>
|
77 |
+
{/* Removed Upload PDF button from sidebar */}
|
78 |
+
</div>
|
79 |
+
|
80 |
+
{/* Main Content */}
|
81 |
+
<div className="flex-1 flex flex-col">
|
82 |
+
{activeTab === 'chat' && (
|
83 |
+
<div className="flex-1 flex flex-col">
|
84 |
+
<div className="p-6 border-b border-gray-200">
|
85 |
+
<h2 className="text-lg font-semibold text-gray-900">Chat with Documents</h2>
|
86 |
+
<p className="text-sm text-gray-600 mt-1">
|
87 |
+
Ask questions about your uploaded PDF documents
|
88 |
+
</p>
|
89 |
+
</div>
|
90 |
+
<div className="flex-1 overflow-hidden">
|
91 |
+
<ChatInterface />
|
92 |
+
</div>
|
93 |
+
</div>
|
94 |
+
)}
|
95 |
+
|
96 |
+
{activeTab === 'documents' && (
|
97 |
+
<div className="flex-1 flex flex-col">
|
98 |
+
<div className="p-6 border-b border-gray-200">
|
99 |
+
<h2 className="text-lg font-semibold text-gray-900">Document Management</h2>
|
100 |
+
<p className="text-sm text-gray-600 mt-1">
|
101 |
+
Upload, view, and manage your PDF documents
|
102 |
+
</p>
|
103 |
+
</div>
|
104 |
+
<div className="flex-1 overflow-auto">
|
105 |
+
<div className="p-6">
|
106 |
+
<DocumentUpload disabled={documentCount >= 3} onDocumentChange={fetchDocumentCount} />
|
107 |
+
<div className="mt-8">
|
108 |
+
<DocumentList onDocumentChange={fetchDocumentCount} />
|
109 |
+
</div>
|
110 |
+
</div>
|
111 |
+
</div>
|
112 |
+
</div>
|
113 |
+
)}
|
114 |
+
{/* Add fixed Upload PDF button at bottom left */}
|
115 |
+
<div className="fixed bottom-6 left-6 z-50">
|
116 |
+
<button
|
117 |
+
onClick={() => setActiveTab('documents')}
|
118 |
+
className={`w-48 flex items-center justify-center px-3 py-2 text-sm font-medium text-white bg-blue-600 rounded-md transition-colors shadow-lg ${documentCount >= 3 ? 'opacity-60 cursor-not-allowed' : ''}`}
|
119 |
+
disabled={documentCount >= 3}
|
120 |
+
>
|
121 |
+
<Upload className="w-4 h-4 mr-2" />
|
122 |
+
Upload PDF
|
123 |
+
</button>
|
124 |
+
</div>
|
125 |
+
</div>
|
126 |
+
</div>
|
127 |
+
)
|
128 |
+
}
|
frontend/components/ChatInterface.tsx
ADDED
@@ -0,0 +1,240 @@
1 |
+
'use client'
|
2 |
+
|
3 |
+
import { useState, useRef, useEffect } from 'react'
|
4 |
+
import { Send, Bot, User, Loader2 } from 'lucide-react'
|
5 |
+
import { useChatStore, ChatMessage } from '@/lib/store'
|
6 |
+
import { apiService } from '@/lib/api'
|
7 |
+
import ReactMarkdown from 'react-markdown'
|
8 |
+
import remarkGfm from 'remark-gfm'
|
9 |
+
|
10 |
+
export default function ChatInterface() {
|
11 |
+
const [input, setInput] = useState('')
|
12 |
+
const [isLoading, setIsLoading] = useState(false)
|
13 |
+
const messagesEndRef = useRef<HTMLDivElement>(null)
|
14 |
+
const { sessionId, messages, addMessage, setLoading } = useChatStore()
|
15 |
+
|
16 |
+
const scrollToBottom = () => {
|
17 |
+
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
|
18 |
+
}
|
19 |
+
|
20 |
+
useEffect(() => {
|
21 |
+
scrollToBottom()
|
22 |
+
}, [messages])
|
23 |
+
|
24 |
+
const handleSubmit = async (e: React.FormEvent) => {
|
25 |
+
e.preventDefault()
|
26 |
+
if (!input.trim() || !sessionId || isLoading) return
|
27 |
+
|
28 |
+
const userMessage = input.trim()
|
29 |
+
setInput('')
|
30 |
+
setIsLoading(true)
|
31 |
+
setLoading(true)
|
32 |
+
|
33 |
+
// Add user message
|
34 |
+
addMessage({
|
35 |
+
content: userMessage,
|
36 |
+
type: 'user',
|
37 |
+
})
|
38 |
+
|
39 |
+
try {
|
40 |
+
const response = await apiService.sendMessage({
|
41 |
+
question: userMessage,
|
42 |
+
session_id: sessionId,
|
43 |
+
})
|
44 |
+
|
45 |
+
// Add assistant message
|
46 |
+
addMessage({
|
47 |
+
content: response.answer,
|
48 |
+
type: 'assistant',
|
49 |
+
sources: response.sources,
|
50 |
+
})
|
51 |
+
} catch (error) {
|
52 |
+
console.error('Error sending message:', error)
|
53 |
+
addMessage({
|
54 |
+
content: 'Sorry, I encountered an error while processing your request. Please try again.',
|
55 |
+
type: 'assistant',
|
56 |
+
})
|
57 |
+
} finally {
|
58 |
+
setIsLoading(false)
|
59 |
+
setLoading(false)
|
60 |
+
}
|
61 |
+
}
|
62 |
+
|
63 |
+
const formatTimestamp = (timestamp: Date) => {
|
64 |
+
return timestamp.toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })
|
65 |
+
}
|
66 |
+
|
67 |
+
return (
|
68 |
+
<div className="flex flex-col h-full">
|
69 |
+
{/* Messages */}
|
70 |
+
<div className="flex-1 overflow-y-auto p-4 space-y-4">
|
71 |
+
{messages.length === 0 ? (
|
72 |
+
<div className="flex items-center justify-center h-full text-gray-500">
|
73 |
+
<div className="text-center">
|
74 |
+
<Bot className="w-12 h-12 mx-auto mb-4 text-gray-300" />
|
75 |
+
<p className="text-lg font-medium">Start a conversation</p>
|
76 |
+
<p className="text-sm">Ask questions about your uploaded documents</p>
|
77 |
+
</div>
|
78 |
+
</div>
|
79 |
+
) : (
|
80 |
+
messages.map((message) => (
|
81 |
+
<div
|
82 |
+
key={message.id}
|
83 |
+
className={`flex ${message.type === 'user' ? 'justify-end' : 'justify-start'}`}
|
84 |
+
>
|
85 |
+
<div
|
86 |
+
className={`max-w-[75%] rounded-xl px-4 py-3 shadow-sm ${
|
87 |
+
message.type === 'user'
|
88 |
+
? 'bg-blue-600 text-white'
|
89 |
+
: 'bg-white text-gray-900 border border-gray-200'
|
90 |
+
}`}
|
91 |
+
>
|
92 |
+
<div className="flex items-start space-x-2">
|
93 |
+
{message.type === 'assistant' && (
|
94 |
+
<Bot className="w-4 h-4 mt-1 flex-shrink-0" />
|
95 |
+
)}
|
96 |
+
<div className="flex-1">
|
97 |
+
<ReactMarkdown
|
98 |
+
remarkPlugins={[remarkGfm]}
|
99 |
+
className="prose prose-sm max-w-none leading-relaxed"
|
100 |
+
components={{
|
101 |
+
// Enhanced paragraph styling
|
102 |
+
p: ({ children }) => (
|
103 |
+
<p className="mb-3 text-gray-800 leading-6">{children}</p>
|
104 |
+
),
|
105 |
+
// Enhanced heading styling
|
106 |
+
h1: ({ children }) => (
|
107 |
+
<h1 className="text-xl font-bold text-gray-900 mb-3 mt-4 border-b border-gray-200 pb-2">{children}</h1>
|
108 |
+
),
|
109 |
+
h2: ({ children }) => (
|
110 |
+
<h2 className="text-lg font-semibold text-gray-900 mb-2 mt-4">{children}</h2>
|
111 |
+
),
|
112 |
+
h3: ({ children }) => (
|
113 |
+
<h3 className="text-base font-medium text-gray-900 mb-2 mt-3">{children}</h3>
|
114 |
+
),
|
115 |
+
// Enhanced list styling
|
116 |
+
ul: ({ children }) => (
|
117 |
+
<ul className="mb-3 ml-4 space-y-1">{children}</ul>
|
118 |
+
),
|
119 |
+
ol: ({ children }) => (
|
120 |
+
<ol className="mb-3 ml-4 space-y-1 list-decimal">{children}</ol>
|
121 |
+
),
|
122 |
+
li: ({ children }) => (
|
123 |
+
<li className="text-gray-800 leading-6">{children}</li>
|
124 |
+
),
|
125 |
+
// Enhanced code styling
|
126 |
+
code: ({ node, inline, className, children, ...props }) => {
|
127 |
+
return (
|
128 |
+
<code
|
129 |
+
className={`${className} ${
|
130 |
+
inline
|
131 |
+
? 'bg-gray-200 px-1.5 py-0.5 rounded text-sm font-mono text-gray-800'
|
132 |
+
: 'block bg-gray-100 p-3 rounded-md text-sm font-mono text-gray-800 border border-gray-200'
|
133 |
+
}`}
|
134 |
+
{...props}
|
135 |
+
>
|
136 |
+
{children}
|
137 |
+
</code>
|
138 |
+
)
|
139 |
+
},
|
140 |
+
// Enhanced blockquote styling
|
141 |
+
blockquote: ({ children }) => (
|
142 |
+
<blockquote className="border-l-4 border-blue-500 pl-4 py-2 my-3 bg-blue-50 rounded-r-md">
|
143 |
+
<div className="text-gray-700 italic">{children}</div>
|
144 |
+
</blockquote>
|
145 |
+
),
|
146 |
+
// Enhanced table styling
|
147 |
+
table: ({ children }) => (
|
148 |
+
<div className="overflow-x-auto my-4">
|
149 |
+
<table className="min-w-full border border-gray-300 rounded-lg">
|
150 |
+
{children}
|
151 |
+
</table>
|
152 |
+
</div>
|
153 |
+
),
|
154 |
+
th: ({ children }) => (
|
155 |
+
<th className="border border-gray-300 px-3 py-2 bg-gray-100 font-semibold text-gray-900 text-left">
|
156 |
+
{children}
|
157 |
+
</th>
|
158 |
+
),
|
159 |
+
td: ({ children }) => (
|
160 |
+
<td className="border border-gray-300 px-3 py-2 text-gray-800">
|
161 |
+
{children}
|
162 |
+
</td>
|
163 |
+
),
|
164 |
+
// Enhanced strong/bold styling
|
165 |
+
strong: ({ children }) => (
|
166 |
+
<strong className="font-semibold text-gray-900">{children}</strong>
|
167 |
+
),
|
168 |
+
// Enhanced emphasis styling
|
169 |
+
em: ({ children }) => (
|
170 |
+
<em className="italic text-gray-700">{children}</em>
|
171 |
+
),
|
172 |
+
}}
|
173 |
+
>
|
174 |
+
{message.content}
|
175 |
+
</ReactMarkdown>
|
176 |
+
{message.sources && message.sources.length > 0 && (
|
177 |
+
<div className="mt-3 pt-3 border-t border-gray-200">
|
178 |
+
<p className="text-xs font-medium text-gray-600 mb-2 flex items-center">
|
179 |
+
<span className="w-2 h-2 bg-blue-500 rounded-full mr-2"></span>
|
180 |
+
Sources:
|
181 |
+
</p>
|
182 |
+
<div className="space-y-1">
|
183 |
+
{message.sources.map((source, index) => (
|
184 |
+
<p key={index} className="text-xs text-gray-600 bg-gray-50 px-2 py-1 rounded">
|
185 |
+
{source}
|
186 |
+
</p>
|
187 |
+
))}
|
188 |
+
</div>
|
189 |
+
</div>
|
190 |
+
)}
|
191 |
+
</div>
|
192 |
+
{message.type === 'user' && (
|
193 |
+
<User className="w-4 h-4 mt-1 flex-shrink-0" />
|
194 |
+
)}
|
195 |
+
</div>
|
196 |
+
<div className="text-xs opacity-70 mt-1">
|
197 |
+
{formatTimestamp(message.timestamp)}
|
198 |
+
</div>
|
199 |
+
</div>
|
200 |
+
</div>
|
201 |
+
))
|
202 |
+
)}
|
203 |
+
|
204 |
+
{isLoading && (
|
205 |
+
<div className="flex justify-start">
|
206 |
+
<div className="bg-gray-100 rounded-lg px-4 py-2">
|
207 |
+
<div className="flex items-center space-x-2">
|
208 |
+
<Loader2 className="w-4 h-4 animate-spin" />
|
209 |
+
<span className="text-sm text-gray-600">Thinking...</span>
|
210 |
+
</div>
|
211 |
+
</div>
|
212 |
+
</div>
|
213 |
+
)}
|
214 |
+
|
215 |
+
<div ref={messagesEndRef} />
|
216 |
+
</div>
|
217 |
+
|
218 |
+
{/* Input */}
|
219 |
+
<div className="border-t border-gray-200 p-4">
|
220 |
+
<form onSubmit={handleSubmit} className="flex space-x-2">
|
221 |
+
<input
|
222 |
+
type="text"
|
223 |
+
value={input}
|
224 |
+
onChange={(e) => setInput(e.target.value)}
|
225 |
+
placeholder="Ask a question about your documents..."
|
226 |
+
className="flex-1 px-3 py-2 border border-gray-300 rounded-md focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent"
|
227 |
+
disabled={isLoading}
|
228 |
+
/>
|
229 |
+
<button
|
230 |
+
type="submit"
|
231 |
+
disabled={!input.trim() || isLoading}
|
232 |
+
className="px-4 py-2 bg-blue-600 text-white rounded-md hover:bg-blue-700 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2 disabled:opacity-50 disabled:cursor-not-allowed"
|
233 |
+
>
|
234 |
+
<Send className="w-4 h-4" />
|
235 |
+
</button>
|
236 |
+
</form>
|
237 |
+
</div>
|
238 |
+
</div>
|
239 |
+
)
|
240 |
+
}
|
frontend/components/DocumentList.tsx
ADDED
@@ -0,0 +1,223 @@
1 |
+
'use client'
|
2 |
+
|
3 |
+
import { useState, useEffect } from 'react'
|
4 |
+
import { FileText, Trash2, Calendar, HardDrive, Eye } from 'lucide-react'
|
5 |
+
import { apiService, Document } from '@/lib/api'
|
6 |
+
|
7 |
+
export default function DocumentList({ onDocumentChange }: { onDocumentChange?: () => void }) {
|
8 |
+
const [documents, setDocuments] = useState<Document[]>([])
|
9 |
+
const [loading, setLoading] = useState(true)
|
10 |
+
const [stats, setStats] = useState<any>(null)
|
11 |
+
|
12 |
+
useEffect(() => {
|
13 |
+
loadDocuments()
|
14 |
+
loadStats()
|
15 |
+
}, [])
|
16 |
+
|
17 |
+
const loadDocuments = async () => {
|
18 |
+
try {
|
19 |
+
setLoading(true)
|
20 |
+
const response = await apiService.getDocuments()
|
21 |
+
setDocuments(response.documents)
|
22 |
+
} catch (error) {
|
23 |
+
console.error('Error loading documents:', error)
|
24 |
+
} finally {
|
25 |
+
setLoading(false)
|
26 |
+
}
|
27 |
+
}
|
28 |
+
|
29 |
+
const loadStats = async () => {
|
30 |
+
try {
|
31 |
+
const response = await apiService.getDocumentStats()
|
32 |
+
setStats(response)
|
33 |
+
} catch (error) {
|
34 |
+
console.error('Error loading stats:', error)
|
35 |
+
}
|
36 |
+
}
|
37 |
+
|
38 |
+
const handleDelete = async (id: number) => {
|
39 |
+
try {
|
40 |
+
await apiService.deleteDocument(id)
|
41 |
+
setDocuments(prev => prev.filter(doc => doc.id !== id))
|
42 |
+
loadStats() // Refresh stats
|
43 |
+
loadDocuments() // Refresh document list after delete
|
44 |
+
if (onDocumentChange) onDocumentChange();
|
45 |
+
} catch (error) {
|
46 |
+
console.error('Error deleting document:', error)
|
47 |
+
alert('Failed to delete document')
|
48 |
+
}
|
49 |
+
}
|
50 |
+
|
51 |
+
const formatFileSize = (bytes: number) => {
|
52 |
+
if (bytes === 0) return '0 Bytes'
|
53 |
+
const k = 1024
|
54 |
+
const sizes = ['Bytes', 'KB', 'MB', 'GB']
|
55 |
+
const i = Math.floor(Math.log(bytes) / Math.log(k))
|
56 |
+
return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i]
|
57 |
+
}
|
58 |
+
|
59 |
+
const formatDate = (dateString: string) => {
|
60 |
+
return new Date(dateString).toLocaleDateString('en-US', {
|
61 |
+
year: 'numeric',
|
62 |
+
month: 'short',
|
63 |
+
day: 'numeric',
|
64 |
+
hour: '2-digit',
|
65 |
+
minute: '2-digit',
|
66 |
+
})
|
67 |
+
}
|
68 |
+
|
69 |
+
if (loading) {
|
70 |
+
return (
|
71 |
+
<div className="flex items-center justify-center py-8">
|
72 |
+
<div className="animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600"></div>
|
73 |
+
</div>
|
74 |
+
)
|
75 |
+
}
|
76 |
+
|
77 |
+
return (
|
78 |
+
<div className="space-y-6">
|
79 |
+
{/* Stats */}
|
80 |
+
{stats && (
|
81 |
+
<div className="grid grid-cols-1 md:grid-cols-4 gap-4">
|
82 |
+
<div className="bg-white p-4 rounded-lg border border-gray-200">
|
83 |
+
<div className="flex items-center">
|
84 |
+
<FileText className="w-8 h-8 text-blue-600" />
|
85 |
+
<div className="ml-3">
|
86 |
+
<p className="text-sm font-medium text-gray-600">Total Documents</p>
|
87 |
+
<p className="text-2xl font-bold text-gray-900">{stats.total_documents}</p>
|
88 |
+
</div>
|
89 |
+
</div>
|
90 |
+
</div>
|
91 |
+
|
92 |
+
<div className="bg-white p-4 rounded-lg border border-gray-200">
|
93 |
+
<div className="flex items-center">
|
94 |
+
<CheckCircle className="w-8 h-8 text-green-600" />
|
95 |
+
<div className="ml-3">
|
96 |
+
<p className="text-sm font-medium text-gray-600">Processed</p>
|
97 |
+
<p className="text-2xl font-bold text-gray-900">{stats.processed_documents}</p>
|
98 |
+
</div>
|
99 |
+
</div>
|
100 |
+
</div>
|
101 |
+
|
102 |
+
<div className="bg-white p-4 rounded-lg border border-gray-200">
|
103 |
+
<div className="flex items-center">
|
104 |
+
<HardDrive className="w-8 h-8 text-purple-600" />
|
105 |
+
<div className="ml-3">
|
106 |
+
<p className="text-sm font-medium text-gray-600">Total Size</p>
|
107 |
+
<p className="text-2xl font-bold text-gray-900">{stats.total_size_mb} MB</p>
|
108 |
+
</div>
|
109 |
+
</div>
|
110 |
+
</div>
|
111 |
+
|
112 |
+
<div className="bg-white p-4 rounded-lg border border-gray-200">
|
113 |
+
<div className="flex items-center">
|
114 |
+
<Database className="w-8 h-8 text-orange-600" />
|
115 |
+
<div className="ml-3">
|
116 |
+
<p className="text-sm font-medium text-gray-600">Vector Chunks</p>
|
117 |
+
<p className="text-2xl font-bold text-gray-900">{stats.vector_store_chunks}</p>
|
118 |
+
</div>
|
119 |
+
</div>
|
120 |
+
</div>
|
121 |
+
</div>
|
122 |
+
)}
|
123 |
+
|
124 |
+
{/* Documents List */}
|
125 |
+
<div>
|
126 |
+
<h3 className="text-lg font-medium text-gray-900 mb-4">Uploaded Documents</h3>
|
127 |
+
|
128 |
+
{documents.length === 0 ? (
|
129 |
+
<div className="text-center py-8">
|
130 |
+
<FileText className="w-12 h-12 mx-auto text-gray-400 mb-4" />
|
131 |
+
<p className="text-gray-500">No documents uploaded yet</p>
|
132 |
+
<p className="text-sm text-gray-400">Upload your first PDF document to get started</p>
|
133 |
+
</div>
|
134 |
+
) : (
|
135 |
+
<div className="bg-white border border-gray-200 rounded-lg overflow-hidden">
|
136 |
+
<div className="overflow-x-auto">
|
137 |
+
<table className="min-w-full divide-y divide-gray-200">
|
138 |
+
<thead className="bg-gray-50">
|
139 |
+
<tr>
|
140 |
+
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">
|
141 |
+
Document
|
142 |
+
</th>
|
143 |
+
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">
|
144 |
+
Size
|
145 |
+
</th>
|
146 |
+
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">
|
147 |
+
Status
|
148 |
+
</th>
|
149 |
+
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">
|
150 |
+
Uploaded
|
151 |
+
</th>
|
152 |
+
<th className="px-6 py-3 text-right text-xs font-medium text-gray-500 uppercase tracking-wider">
|
153 |
+
Actions
|
154 |
+
</th>
|
155 |
+
</tr>
|
156 |
+
</thead>
|
157 |
+
<tbody className="bg-white divide-y divide-gray-200">
|
158 |
+
{documents.map((document) => (
|
159 |
+
<tr key={document.id} className="hover:bg-gray-50">
|
160 |
+
<td className="px-6 py-4 whitespace-nowrap">
|
161 |
+
<div className="flex items-center">
|
162 |
+
<FileText className="w-5 h-5 text-gray-400 mr-3" />
|
163 |
+
<div>
|
164 |
+
<div className="text-sm font-medium text-gray-900">
|
165 |
+
{document.original_filename}
|
166 |
+
</div>
|
167 |
+
<div className="text-sm text-gray-500">
|
168 |
+
ID: {document.id}
|
169 |
+
</div>
|
170 |
+
</div>
|
171 |
+
</div>
|
172 |
+
</td>
|
173 |
+
<td className="px-6 py-4 whitespace-nowrap text-sm text-gray-900">
|
174 |
+
{formatFileSize(document.file_size)}
|
175 |
+
</td>
|
176 |
+
<td className="px-6 py-4 whitespace-nowrap">
|
177 |
+
<span className={`inline-flex px-2 py-1 text-xs font-semibold rounded-full ${
|
178 |
+
document.processed
|
179 |
+
? 'bg-green-100 text-green-800'
|
180 |
+
: 'bg-yellow-100 text-yellow-800'
|
181 |
+
}`}>
|
182 |
+
{document.processed ? 'Processed' : 'Processing'}
|
183 |
+
</span>
|
184 |
+
</td>
|
185 |
+
<td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500">
|
186 |
+
<div className="flex items-center">
|
187 |
+
<Calendar className="w-4 h-4 mr-1" />
|
188 |
+
{formatDate(document.created_at)}
|
189 |
+
</div>
|
190 |
+
</td>
|
191 |
+
<td className="px-6 py-4 whitespace-nowrap text-right text-sm font-medium">
|
192 |
+
<button
|
193 |
+
onClick={() => handleDelete(document.id)}
|
194 |
+
className="text-red-600 hover:text-red-900 p-1"
|
195 |
+
title="Delete document"
|
196 |
+
>
|
197 |
+
<Trash2 className="w-4 h-4" />
|
198 |
+
</button>
|
199 |
+
</td>
|
200 |
+
</tr>
|
201 |
+
))}
|
202 |
+
</tbody>
|
203 |
+
</table>
|
204 |
+
</div>
|
205 |
+
</div>
|
206 |
+
)}
|
207 |
+
</div>
|
208 |
+
</div>
|
209 |
+
)
|
210 |
+
}
|
211 |
+
|
212 |
+
// Placeholder components for missing icons
|
213 |
+
const CheckCircle = ({ className }: { className?: string }) => (
|
214 |
+
<svg className={className} fill="none" stroke="currentColor" viewBox="0 0 24 24">
|
215 |
+
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z" />
|
216 |
+
</svg>
|
217 |
+
)
|
218 |
+
|
219 |
+
const Database = ({ className }: { className?: string }) => (
|
220 |
+
<svg className={className} fill="none" stroke="currentColor" viewBox="0 0 24 24">
|
221 |
+
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 7v10c0 2.21 3.582 4 8 4s8-1.79 8-4V7M4 7c0 2.21 3.582 4 8 4s8-1.79 8-4M4 7c0-2.21 3.582-4 8-4s8 1.79 8 4" />
|
222 |
+
</svg>
|
223 |
+
)
|
frontend/components/DocumentUpload.tsx
ADDED
@@ -0,0 +1,199 @@
1 |
+
'use client'
|
2 |
+
|
3 |
+
import { useState, useCallback } from 'react'
|
4 |
+
import { useDropzone } from 'react-dropzone'
|
5 |
+
import { Upload, FileText, X, CheckCircle, AlertCircle } from 'lucide-react'
|
6 |
+
import { apiService } from '@/lib/api'
|
7 |
+
|
8 |
+
interface UploadStatus {
|
9 |
+
file: File
|
10 |
+
status: 'uploading' | 'success' | 'error'
|
11 |
+
message?: string
|
12 |
+
}
|
13 |
+
|
14 |
+
interface DocumentUploadProps {
|
15 |
+
disabled?: boolean
|
16 |
+
onDocumentChange?: () => void
|
17 |
+
}
|
18 |
+
|
19 |
+
export default function DocumentUpload({ disabled, onDocumentChange }: DocumentUploadProps) {
|
20 |
+
const [uploadStatuses, setUploadStatuses] = useState<UploadStatus[]>([])
|
21 |
+
const [infoMessage, setInfoMessage] = useState<string | null>(null)
|
22 |
+
|
23 |
+
const onDrop = useCallback(async (acceptedFiles: File[]) => {
|
24 |
+
if (disabled) return;
|
25 |
+
// Fetch current document count
|
26 |
+
let currentCount = 0;
|
27 |
+
try {
|
28 |
+
const response = await apiService.getDocuments();
|
29 |
+
currentCount = response.documents.length;
|
30 |
+
} catch (e) {
|
31 |
+
currentCount = 0;
|
32 |
+
}
|
33 |
+
// Only allow up to 3 documents total
|
34 |
+
const allowed = Math.max(0, 3 - currentCount);
|
35 |
+
const filesToUpload = acceptedFiles.slice(0, allowed);
|
36 |
+
const ignoredCount = acceptedFiles.length - filesToUpload.length;
|
37 |
+
if (ignoredCount > 0) {
|
38 |
+
setInfoMessage(`Only the first ${allowed} file(s) were uploaded. The rest were ignored to keep the maximum of 3 documents.`);
|
39 |
+
setTimeout(() => setInfoMessage(null), 4000);
|
40 |
+
}
|
41 |
+
if (filesToUpload.length === 0) return;
|
42 |
+
const newUploads = filesToUpload.map(file => ({
|
43 |
+
file,
|
44 |
+
status: 'uploading' as const,
|
45 |
+
}))
|
46 |
+
setUploadStatuses(prev => [...prev, ...newUploads])
|
47 |
+
for (const file of filesToUpload) {
|
48 |
+
try {
|
49 |
+
const response = await apiService.uploadDocument(file)
|
50 |
+
setUploadStatuses(prev =>
|
51 |
+
prev.map(upload =>
|
52 |
+
upload.file === file
|
53 |
+
? { ...upload, status: 'success', message: response.message }
|
54 |
+
: upload
|
55 |
+
)
|
56 |
+
)
|
57 |
+
if (onDocumentChange) onDocumentChange();
|
58 |
+
} catch (error: any) {
|
59 |
+
setUploadStatuses(prev =>
|
60 |
+
prev.map(upload =>
|
61 |
+
upload.file === file
|
62 |
+
? { ...upload, status: 'error', message: error.response?.data?.detail || 'Upload failed' }
|
63 |
+
: upload
|
64 |
+
)
|
65 |
+
)
|
66 |
+
}
|
67 |
+
}
|
68 |
+
}, [disabled, onDocumentChange])
|
69 |
+
|
70 |
+
const { getRootProps, getInputProps, isDragActive } = useDropzone({
|
71 |
+
onDrop,
|
72 |
+
accept: {
|
73 |
+
'application/pdf': ['.pdf']
|
74 |
+
},
|
75 |
+
multiple: true,
|
76 |
+
maxSize: 10 * 1024 * 1024, // 10MB
|
77 |
+
disabled,
|
78 |
+
})
|
79 |
+
|
80 |
+
const removeUpload = async (file: File) => {
|
81 |
+
// If the upload was successful, delete the document from the backend as well
|
82 |
+
const upload = uploadStatuses.find(u => u.file === file)
|
83 |
+
if (upload && upload.status === 'success') {
|
84 |
+
try {
|
85 |
+
// Fetch all documents and find the one with matching original_filename
|
86 |
+
const response = await apiService.getDocuments()
|
87 |
+
const doc = response.documents.find((d: any) => d.original_filename === file.name)
|
88 |
+
if (doc) {
|
89 |
+
await apiService.deleteDocument(doc.id)
|
90 |
+
}
|
91 |
+
} catch (e) {
|
92 |
+
// Optionally show error, but always remove from UI
|
93 |
+
}
|
94 |
+
}
|
95 |
+
setUploadStatuses(prev => prev.filter(upload => upload.file !== file))
|
96 |
+
if (onDocumentChange) onDocumentChange();
|
97 |
+
}
|
98 |
+
|
99 |
+
const formatFileSize = (bytes: number) => {
|
100 |
+
if (bytes === 0) return '0 Bytes'
|
101 |
+
const k = 1024
|
102 |
+
const sizes = ['Bytes', 'KB', 'MB', 'GB']
|
103 |
+
const i = Math.floor(Math.log(bytes) / Math.log(k))
|
104 |
+
return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i]
|
105 |
+
}
|
106 |
+
|
107 |
+
return (
|
108 |
+
<div className="space-y-4">
|
109 |
+
<div
|
110 |
+
{...getRootProps()}
|
111 |
+
className={`border-2 border-dashed rounded-lg p-8 text-center transition-colors ${
|
112 |
+
disabled
|
113 |
+
? 'border-gray-200 bg-gray-100 cursor-not-allowed opacity-60'
|
114 |
+
: isDragActive
|
115 |
+
? 'border-blue-500 bg-blue-50 cursor-pointer'
|
116 |
+
: 'border-gray-300 hover:border-gray-400 cursor-pointer'
|
117 |
+
}`}
|
118 |
+
>
|
119 |
+
<input {...getInputProps()} disabled={disabled} />
|
120 |
+
<Upload className="w-12 h-12 mx-auto mb-4 text-gray-400" />
|
121 |
+
{infoMessage && (
|
122 |
+
<p className="text-sm text-yellow-600 mb-2">{infoMessage}</p>
|
123 |
+
)}
|
124 |
+
{disabled ? (
|
125 |
+
<p className="text-lg font-medium text-gray-400">Maximum 3 documents uploaded</p>
|
126 |
+
) : isDragActive ? (
|
127 |
+
<p className="text-lg font-medium text-blue-600">Drop the PDF files here...</p>
|
128 |
+
) : (
|
129 |
+
<div>
|
130 |
+
<p className="text-lg font-medium text-gray-900 mb-2">
|
131 |
+
Upload PDF Documents
|
132 |
+
</p>
|
133 |
+
<p className="text-sm text-gray-600 mb-4">
|
134 |
+
Drag and drop PDF files here, or click to select files
|
135 |
+
</p>
|
136 |
+
<p className="text-xs text-gray-500">
|
137 |
+
Maximum file size: 10MB • Supported format: PDF
|
138 |
+
</p>
|
139 |
+
</div>
|
140 |
+
)}
|
141 |
+
</div>
|
142 |
+
|
143 |
+
{/* Upload Status */}
|
144 |
+
{uploadStatuses.length > 0 && (
|
145 |
+
<div className="space-y-2">
|
146 |
+
<h3 className="text-sm font-medium text-gray-900">Upload Status</h3>
|
147 |
+
{uploadStatuses.map((upload, index) => (
|
148 |
+
<div
|
149 |
+
key={index}
|
150 |
+
className="flex items-center justify-between p-3 bg-gray-50 rounded-lg"
|
151 |
+
>
|
152 |
+
<div className="flex items-center space-x-3">
|
153 |
+
<FileText className="w-5 h-5 text-gray-400" />
|
154 |
+
<div>
|
155 |
+
<p className="text-sm font-medium text-gray-900">
|
156 |
+
{upload.file.name}
|
157 |
+
</p>
|
158 |
+
<p className="text-xs text-gray-500">
|
159 |
+
{formatFileSize(upload.file.size)}
|
160 |
+
</p>
|
161 |
+
</div>
|
162 |
+
</div>
|
163 |
+
|
164 |
+
<div className="flex items-center space-x-2">
|
165 |
+
{upload.status === 'uploading' && (
|
166 |
+
<div className="flex items-center space-x-2">
|
167 |
+
<div className="w-4 h-4 border-2 border-blue-500 border-t-transparent rounded-full animate-spin" />
|
168 |
+
<span className="text-sm text-blue-600">Uploading...</span>
|
169 |
+
</div>
|
170 |
+
)}
|
171 |
+
|
172 |
+
{upload.status === 'success' && (
|
173 |
+
<div className="flex items-center space-x-2">
|
174 |
+
<CheckCircle className="w-4 h-4 text-green-500" />
|
175 |
+
<span className="text-sm text-green-600">Success</span>
|
176 |
+
</div>
|
177 |
+
)}
|
178 |
+
|
179 |
+
{upload.status === 'error' && (
|
180 |
+
<div className="flex items-center space-x-2">
|
181 |
+
<AlertCircle className="w-4 h-4 text-red-500" />
|
182 |
+
<span className="text-sm text-red-600">Error</span>
|
183 |
+
</div>
|
184 |
+
)}
|
185 |
+
|
186 |
+
<button
|
187 |
+
onClick={() => removeUpload(upload.file)}
|
188 |
+
className="p-1 hover:bg-gray-200 rounded"
|
189 |
+
>
|
190 |
+
<X className="w-4 h-4 text-gray-400" />
|
191 |
+
</button>
|
192 |
+
</div>
|
193 |
+
</div>
|
194 |
+
))}
|
195 |
+
</div>
|
196 |
+
)}
|
197 |
+
</div>
|
198 |
+
)
|
199 |
+
}
|
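For orientation, here is a minimal usage sketch of this dropzone component. The `DocumentUpload` export name and the `disabled` / `infoMessage` / `onDocumentChange` props are read off the code above; the parent wiring is hypothetical and not part of the commit.

// Hypothetical parent wiring for the dropzone component (sketch, assumptions noted above).
import DocumentUpload from '@/components/DocumentUpload'

export default function UploadSection({ documentCount }: { documentCount: number }) {
  return (
    <DocumentUpload
      disabled={documentCount >= 3} // matches the "Maximum 3 documents uploaded" state above
      infoMessage={documentCount >= 3 ? 'Delete a document to upload a new one.' : undefined}
      onDocumentChange={() => window.location.reload()} // placeholder refresh; a real app would re-fetch
    />
  )
}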
frontend/lib/api.ts
ADDED
@@ -0,0 +1,109 @@
import axios from 'axios'

const API_BASE_URL = process.env.NEXT_PUBLIC_API_URL || 'http://localhost:8000'

const api = axios.create({
  baseURL: API_BASE_URL,
  headers: {
    'Content-Type': 'application/json',
  },
})

export interface Document {
  id: number
  filename: string
  original_filename: string
  file_size: number
  content?: string
  processed: boolean
  created_at: string
  updated_at?: string
}

export interface ChatRequest {
  question: string
  session_id: string
  model?: string
}

export interface ChatResponse {
  success: boolean
  answer: string
  model?: string
  sources: string[]
  session_id: string
  message_id?: number
}

export interface UploadResponse {
  success: boolean
  document?: Document
  message: string
}

export const apiService = {
  // Document endpoints
  uploadDocument: async (file: File): Promise<UploadResponse> => {
    const formData = new FormData()
    formData.append('file', file)

    const response = await api.post('/api/v1/documents/upload', formData, {
      headers: {
        'Content-Type': 'multipart/form-data',
      },
    })
    return response.data
  },

  getDocuments: async (): Promise<{ documents: Document[]; total: number }> => {
    const response = await api.get('/api/v1/documents/')
    return response.data
  },

  deleteDocument: async (id: number): Promise<{ success: boolean; message: string }> => {
    const response = await api.delete(`/api/v1/documents/${id}`)
    return response.data
  },

  getDocumentStats: async (): Promise<any> => {
    const response = await api.get('/api/v1/documents/stats/summary')
    return response.data
  },

  // Chat endpoints
  sendMessage: async (request: ChatRequest): Promise<ChatResponse> => {
    const response = await api.post('/api/v1/chat/', request)
    return response.data
  },

  getChatHistory: async (sessionId: string): Promise<any> => {
    const response = await api.get(`/api/v1/chat/history/${sessionId}`)
    return response.data
  },

  createSession: async (): Promise<{ session_id: string }> => {
    const response = await api.post('/api/v1/chat/session/new')
    return response.data
  },

  getSessions: async (): Promise<any[]> => {
    const response = await api.get('/api/v1/chat/sessions')
    return response.data
  },

  deleteSession: async (sessionId: string): Promise<{ success: boolean; message: string }> => {
    const response = await api.delete(`/api/v1/chat/session/${sessionId}`)
    return response.data
  },

  getAvailableModels: async (): Promise<{ available_models: string[]; is_configured: boolean }> => {
    const response = await api.get('/api/v1/chat/models/available')
    return response.data
  },

  // Health check
  healthCheck: async (): Promise<any> => {
    const response = await api.get('/health')
    return response.data
  },
}
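As a quick illustration of how the frontend is expected to drive this client end to end, here is a sketch (not part of the commit; the question text is illustrative and the error handling is minimal):

// Example round trip against apiService: upload a PDF, then ask a question in a fresh session.
import { apiService } from '@/lib/api'

async function uploadAndAsk(file: File): Promise<void> {
  const upload = await apiService.uploadDocument(file)
  if (!upload.success) throw new Error(upload.message)

  const { session_id } = await apiService.createSession()
  const reply = await apiService.sendMessage({
    question: 'Summarize the uploaded document.',
    session_id,
  })
  console.log(reply.answer, reply.sources)
}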
frontend/lib/store.ts
ADDED
@@ -0,0 +1,50 @@
import { create } from 'zustand'
import { v4 as uuidv4 } from 'uuid'

export interface ChatMessage {
  id: string
  content: string
  type: 'user' | 'assistant'
  timestamp: Date
  sources?: string[]
}

interface ChatStore {
  sessionId: string | null
  messages: ChatMessage[]
  isLoading: boolean
  createNewSession: () => void
  addMessage: (message: Omit<ChatMessage, 'id' | 'timestamp'>) => void
  setLoading: (loading: boolean) => void
  clearMessages: () => void
}

export const useChatStore = create<ChatStore>((set, get) => ({
  sessionId: null,
  messages: [],
  isLoading: false,

  createNewSession: () => {
    const sessionId = uuidv4()
    set({ sessionId, messages: [] })
  },

  addMessage: (message) => {
    const newMessage: ChatMessage = {
      ...message,
      id: uuidv4(),
      timestamp: new Date(),
    }
    set((state) => ({
      messages: [...state.messages, newMessage]
    }))
  },

  setLoading: (loading) => {
    set({ isLoading: loading })
  },

  clearMessages: () => {
    set({ messages: [] })
  },
}))
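A minimal consumer sketch for this Zustand store follows. The component is hypothetical and not part of the commit; it only uses the `messages` and `createNewSession` members defined above.

// Hypothetical component reading from and writing to useChatStore (sketch).
'use client'
import { useChatStore } from '@/lib/store'

export function NewChatButton() {
  const messages = useChatStore((state) => state.messages)
  const createNewSession = useChatStore((state) => state.createNewSession)

  return (
    <button onClick={createNewSession}>
      New chat ({messages.length} messages so far)
    </button>
  )
}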
frontend/next-env.d.ts
ADDED
@@ -0,0 +1,5 @@
/// <reference types="next" />
/// <reference types="next/image-types/global" />

// NOTE: This file should not be edited
// see https://nextjs.org/docs/basic-features/typescript for more information.
frontend/next.config.js
ADDED
@@ -0,0 +1,19 @@
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    appDir: true,
  },
  images: {
    domains: ['localhost'],
  },
  async rewrites() {
    return [
      {
        source: '/api/:path*',
        destination: 'http://localhost:8000/api/:path*',
      },
    ];
  },
};

module.exports = nextConfig;
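The `rewrites()` entry proxies any relative `/api/...` request from the Next.js server to the FastAPI backend on port 8000. That means browser code could also reach the API without an absolute base URL; a sketch, for illustration only (the committed `lib/api.ts` talks to `NEXT_PUBLIC_API_URL` directly instead):

// Relative fetch relying on the rewrite above during `next dev` (illustrative sketch).
async function listDocumentsViaProxy(): Promise<{ documents: unknown[]; total: number }> {
  const res = await fetch('/api/v1/documents/') // proxied to http://localhost:8000/api/v1/documents/
  if (!res.ok) throw new Error(`Request failed: ${res.status}`)
  return res.json()
}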
frontend/package-lock.json
ADDED
The diff for this file is too large to render.
frontend/package.json
ADDED
@@ -0,0 +1,44 @@
{
  "name": "pdf-chatbot-frontend",
  "version": "0.1.0",
  "private": true,
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  },
  "dependencies": {
    "next": "14.0.4",
    "react": "^18",
    "react-dom": "^18",
    "@types/node": "^20",
    "@types/react": "^18",
    "@types/react-dom": "^18",
    "typescript": "^5",
    "tailwindcss": "^3.3.0",
    "autoprefixer": "^10.0.1",
    "postcss": "^8",
    "lucide-react": "^0.294.0",
    "class-variance-authority": "^0.7.0",
    "clsx": "^2.0.0",
    "tailwind-merge": "^2.0.0",
    "zustand": "^4.4.7",
    "react-hook-form": "^7.48.2",
    "@hookform/resolvers": "^3.3.2",
    "zod": "^3.22.4",
    "axios": "^1.6.2",
    "react-dropzone": "^14.2.3",
    "react-markdown": "^9.0.1",
    "remark-gfm": "^4.0.0",
    "react-syntax-highlighter": "^15.5.0",
    "@types/react-syntax-highlighter": "^15.5.11",
    "uuid": "^9.0.1",
    "@types/uuid": "^9.0.7",
    "tailwindcss-animate": "^1.0.7"
  },
  "devDependencies": {
    "eslint": "^8",
    "eslint-config-next": "14.0.4"
  }
}
frontend/postcss.config.js
ADDED
@@ -0,0 +1,6 @@
module.exports = {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
}
frontend/tailwind.config.js
ADDED
@@ -0,0 +1,76 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
  darkMode: ["class"],
  content: [
    './pages/**/*.{ts,tsx}',
    './components/**/*.{ts,tsx}',
    './app/**/*.{ts,tsx}',
    './src/**/*.{ts,tsx}',
  ],
  theme: {
    container: {
      center: true,
      padding: "2rem",
      screens: {
        "2xl": "1400px",
      },
    },
    extend: {
      colors: {
        border: "hsl(var(--border))",
        input: "hsl(var(--input))",
        ring: "hsl(var(--ring))",
        background: "hsl(var(--background))",
        foreground: "hsl(var(--foreground))",
        primary: {
          DEFAULT: "hsl(var(--primary))",
          foreground: "hsl(var(--primary-foreground))",
        },
        secondary: {
          DEFAULT: "hsl(var(--secondary))",
          foreground: "hsl(var(--secondary-foreground))",
        },
        destructive: {
          DEFAULT: "hsl(var(--destructive))",
          foreground: "hsl(var(--destructive-foreground))",
        },
        muted: {
          DEFAULT: "hsl(var(--muted))",
          foreground: "hsl(var(--muted-foreground))",
        },
        accent: {
          DEFAULT: "hsl(var(--accent))",
          foreground: "hsl(var(--accent-foreground))",
        },
        popover: {
          DEFAULT: "hsl(var(--popover))",
          foreground: "hsl(var(--popover-foreground))",
        },
        card: {
          DEFAULT: "hsl(var(--card))",
          foreground: "hsl(var(--card-foreground))",
        },
      },
      borderRadius: {
        lg: "var(--radius)",
        md: "calc(var(--radius) - 2px)",
        sm: "calc(var(--radius) - 4px)",
      },
      keyframes: {
        "accordion-down": {
          from: { height: 0 },
          to: { height: "var(--radix-accordion-content-height)" },
        },
        "accordion-up": {
          from: { height: "var(--radix-accordion-content-height)" },
          to: { height: 0 },
        },
      },
      animation: {
        "accordion-down": "accordion-down 0.2s ease-out",
        "accordion-up": "accordion-up 0.2s ease-out",
      },
    },
  },
  plugins: [require("tailwindcss-animate")],
}
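The color entries above resolve to CSS custom properties (declared in `app/globals.css` in this commit), so components can consume them as ordinary utility classes. A hypothetical example, not part of the commit:

// Sketch of a component using the theme tokens defined in the Tailwind config above.
export function PrimaryBadge({ label }: { label: string }) {
  return (
    <span className="rounded-md bg-primary px-2 py-1 text-sm text-primary-foreground">
      {label}
    </span>
  )
}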
frontend/tsconfig.json
ADDED
@@ -0,0 +1,28 @@
{
  "compilerOptions": {
    "target": "es5",
    "lib": ["dom", "dom.iterable", "es6"],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "esnext",
    "moduleResolution": "bundler",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "preserve",
    "incremental": true,
    "plugins": [
      {
        "name": "next"
      }
    ],
    "baseUrl": ".",
    "paths": {
      "@/*": ["./*"]
    }
  },
  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
  "exclude": ["node_modules"]
}
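The `baseUrl`/`paths` pair maps `@/*` onto the frontend root, so imports can stay absolute instead of using relative paths; for example (illustrative only):

// With the "@/*" alias above, project-root imports resolve without ../.. chains.
import { apiService } from '@/lib/api'     // resolves to frontend/lib/api.ts
import { useChatStore } from '@/lib/store' // resolves to frontend/lib/store.ts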
setup.ps1
ADDED
@@ -0,0 +1,96 @@
# PowerShell setup script for PDF Q&A Chatbot System

Write-Host "🚀 Setting up PDF Q&A Chatbot System..." -ForegroundColor Green

# Check if Python is installed
try {
    $pythonVersion = python --version 2>&1
    Write-Host "✅ Python found: $pythonVersion" -ForegroundColor Green
} catch {
    Write-Host "❌ Python is required but not installed. Please install Python 3.8+ and try again." -ForegroundColor Red
    exit 1
}

# Check if Node.js is installed
try {
    $nodeVersion = node --version 2>&1
    Write-Host "✅ Node.js found: $nodeVersion" -ForegroundColor Green
} catch {
    Write-Host "❌ Node.js is required but not installed. Please install Node.js 18+ and try again." -ForegroundColor Red
    exit 1
}

# Check if npm is installed
try {
    $npmVersion = npm --version 2>&1
    Write-Host "✅ npm found: $npmVersion" -ForegroundColor Green
} catch {
    Write-Host "❌ npm is required but not installed. Please install npm and try again." -ForegroundColor Red
    exit 1
}

Write-Host "✅ Prerequisites check passed" -ForegroundColor Green

# Backend setup
Write-Host "📦 Setting up backend..." -ForegroundColor Yellow
Set-Location backend

# Create virtual environment
Write-Host "Creating Python virtual environment..." -ForegroundColor Yellow
python -m venv venv

# Activate virtual environment
Write-Host "Activating virtual environment..." -ForegroundColor Yellow
.\venv\Scripts\Activate.ps1

# Install dependencies
Write-Host "Installing Python dependencies..." -ForegroundColor Yellow
pip install -r requirements.txt

# Create .env file if it doesn't exist
if (-not (Test-Path .env)) {
    Write-Host "Creating .env file..." -ForegroundColor Yellow
    Copy-Item .env.example .env
    Write-Host "⚠️ Please edit backend/.env and add your API keys (OpenAI or Anthropic)" -ForegroundColor Yellow
}

Set-Location ..

# Frontend setup
Write-Host "📦 Setting up frontend..." -ForegroundColor Yellow
Set-Location frontend

# Install dependencies
Write-Host "Installing Node.js dependencies..." -ForegroundColor Yellow
npm install

# Create .env file if it doesn't exist
if (-not (Test-Path .env)) {
    Write-Host "Creating .env file..." -ForegroundColor Yellow
    Copy-Item .env.example .env
}

Set-Location ..

Write-Host ""
Write-Host "🎉 Setup completed successfully!" -ForegroundColor Green
Write-Host ""
Write-Host "📋 Next steps:" -ForegroundColor Cyan
Write-Host "1. Edit backend/.env and add your API keys:" -ForegroundColor White
Write-Host "   - OPENAI_API_KEY or ANTHROPIC_API_KEY" -ForegroundColor White
Write-Host ""
Write-Host "2. Start the backend server:" -ForegroundColor White
Write-Host "   cd backend" -ForegroundColor White
Write-Host "   .\venv\Scripts\Activate.ps1" -ForegroundColor White
Write-Host "   uvicorn main:app --reload" -ForegroundColor White
Write-Host ""
Write-Host "3. Start the frontend server (in a new terminal):" -ForegroundColor White
Write-Host "   cd frontend" -ForegroundColor White
Write-Host "   npm run dev" -ForegroundColor White
Write-Host ""
Write-Host "4. Open your browser and go to: http://localhost:3000" -ForegroundColor White
Write-Host ""
Write-Host "🐳 Alternatively, you can use Docker:" -ForegroundColor White
Write-Host "   docker-compose up --build" -ForegroundColor White
Write-Host ""
Write-Host "📖 For more information, see the README.md file" -ForegroundColor White
setup.sh
ADDED
@@ -0,0 +1,90 @@
#!/bin/bash

echo "🚀 Setting up PDF Q&A Chatbot System..."

# Check if Python is installed
if ! command -v python3 &> /dev/null; then
    echo "❌ Python 3 is required but not installed. Please install Python 3.8+ and try again."
    exit 1
fi

# Check if Node.js is installed
if ! command -v node &> /dev/null; then
    echo "❌ Node.js is required but not installed. Please install Node.js 18+ and try again."
    exit 1
fi

# Check if npm is installed
if ! command -v npm &> /dev/null; then
    echo "❌ npm is required but not installed. Please install npm and try again."
    exit 1
fi

echo "✅ Prerequisites check passed"

# Backend setup
echo "📦 Setting up backend..."
cd backend

# Create virtual environment
echo "Creating Python virtual environment..."
python3 -m venv venv

# Activate virtual environment
if [[ "$OSTYPE" == "msys" || "$OSTYPE" == "win32" ]]; then
    source venv/Scripts/activate
else
    source venv/bin/activate
fi

# Install dependencies
echo "Installing Python dependencies..."
pip install -r requirements.txt

# Create .env file if it doesn't exist
if [ ! -f .env ]; then
    echo "Creating .env file..."
    cp .env.example .env
    echo "⚠️ Please edit backend/.env and add your API keys (OpenAI or Anthropic)"
fi

cd ..

# Frontend setup
echo "📦 Setting up frontend..."
cd frontend

# Install dependencies
echo "Installing Node.js dependencies..."
npm install

# Create .env file if it doesn't exist
if [ ! -f .env ]; then
    echo "Creating .env file..."
    cp .env.example .env
fi

cd ..

echo ""
echo "🎉 Setup completed successfully!"
echo ""
echo "📋 Next steps:"
echo "1. Edit backend/.env and add your API keys:"
echo "   - OPENAI_API_KEY or ANTHROPIC_API_KEY"
echo ""
echo "2. Start the backend server:"
echo "   cd backend"
echo "   source venv/bin/activate  # On Windows: venv\\Scripts\\activate"
echo "   uvicorn main:app --reload"
echo ""
echo "3. Start the frontend server (in a new terminal):"
echo "   cd frontend"
echo "   npm run dev"
echo ""
echo "4. Open your browser and go to: http://localhost:3000"
echo ""
echo "🐳 Alternatively, you can use Docker:"
echo "   docker-compose up --build"
echo ""
echo "📖 For more information, see the README.md file"