Paul Magee committed
Commit c05de71 · 1 Parent(s): e3be075

Comprehensive UI/UX and Architecture Improvements


UI Improvements:
- Added a "Clear Chat" button in the sidebar that remains accessible regardless of scroll position
- Styled the button to match the application's dark theme
- Fixed button styling so the label stays on a single line
- Fixed hover jolt by stabilizing the collapse button position
- Collapsed the sidebar by default for a cleaner initial UI
- Removed the "Deploy" button and other unnecessary Streamlit elements
- Changed the layout from "wide" to "centered" for better readability

Architecture Improvements:
- Renamed and restructured files to better reflect their purposes (chatbot.py → backend.py, app.py → frontend.py)
- Created a separate admin.py interface for administrative functions
- Implemented proper authentication in the admin interface
- Improved configuration management with centralized config handling
- Enhanced error handling and logging throughout the application
- Fixed URL parameter handling to use the current Streamlit API (removing deprecated methods)

Functionality Enhancements:
- Implemented document reindexing capabilities in the admin interface
- Added proper session state management for chat history
- Improved document loading and indexing processes
- Added better documentation in code comments and markup
- Optimized the LlamaIndex integration for more efficient document retrieval
- Enhanced error recovery mechanisms

Code Quality:
- Refactored for better code organization and maintainability
- Updated deprecated API calls to current versions
- Improved exception handling across the application
- Added proper logging for important events and user actions
- Created a cleaner separation between frontend and backend logic

.gitignore CHANGED
@@ -4,7 +4,6 @@ __pycache__/
 *$py.class
 *.so
 .Python
-env/
 build/
 develop-eggs/
 dist/
@@ -21,22 +20,56 @@ wheels/
 .installed.cfg
 *.egg
 
-# Virtual Environment
+# Environment variables
+.env
+.venv
+env/
 venv/
 ENV/
-myenv/
 
-# IDE
+# Streamlit secrets
+.streamlit/secrets.toml
+
+# Data and indexes
+index/
+data/
+
+# Logs
+logs/
+*.log
+
+# IDE files
 .idea/
 .vscode/
 *.swp
 *.swo
-
-# Project specific
-.env
-index/
-cache/
 .DS_Store
 
-# Logs
-*.log
+# Testing
+.pytest_cache/
+.coverage
+htmlcov/
+.tox/
+.nox/
+.hypothesis/
+
+# Docker
+.docker/
+
+# Distribution / packaging
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
DEPLOYMENT.md ADDED
@@ -0,0 +1,298 @@
1
+ # Deployment Guide for Document Chatbot
2
+
3
+ This guide outlines the steps to deploy the Document Chatbot application to production using Docker and LlamaDeploy.
4
+
5
+ ## Deployment Options
6
+
7
+ The Document Chatbot can be deployed in several ways:
8
+
9
+ 1. **Local Development**: Running directly with Streamlit (as described in the README)
10
+ 2. **Docker Containerization**: Packaging the application in Docker containers
11
+ 3. **LlamaDeploy**: Using LlamaIndex's deployment framework for production-grade hosting
12
+
13
+ This guide focuses on options 2 and 3, which are most suitable for production environments.
14
+
15
+ ## Prerequisites
16
+
17
+ - Docker and Docker Compose installed
18
+ - GitHub account (for version control)
19
+ - Anthropic API key
20
+ - Basic familiarity with command line tools
21
+
22
+ ## Step 1: Setting Up Version Control with GitHub
23
+
24
+ 1. Initialize a Git repository and commit your code:
25
+ ```bash
26
+ git init
27
+ git add .
28
+ git commit -m "Initial commit"
29
+ ```
30
+
31
+ 2. Create a GitHub repository and push your code:
32
+ ```bash
33
+ git remote add origin <your-github-repo-url>
34
+ git push -u origin main
35
+ ```
36
+
37
+ ## Step 2: Dockerizing the Application
38
+
39
+ ### Create Dockerfile
40
+
41
+ Create a `Dockerfile` in the project root:
42
+
43
+ ```dockerfile
44
+ FROM python:3.9-slim
45
+
46
+ WORKDIR /app
47
+
48
+ # Copy requirements first for better caching
49
+ COPY requirements.txt .
50
+ RUN pip install --no-cache-dir -r requirements.txt
51
+
52
+ # Copy the rest of the application
53
+ COPY . .
54
+
55
+ # Create directories if they don't exist
56
+ RUN mkdir -p data index logs
57
+
58
+ # Expose the port Streamlit will run on
59
+ EXPOSE 8501
60
+
61
+ # Set environment variables
62
+ ENV PYTHONUNBUFFERED=1
63
+
64
+ # Set up entry point
65
+ ENTRYPOINT ["streamlit", "run"]
66
+ CMD ["frontend.py"]
67
+ ```
68
+
69
+ ### Create Docker Compose File
70
+
71
+ Create a `docker-compose.yml` file for managing multiple services:
72
+
73
+ ```yaml
74
+ version: '3'
75
+
76
+ services:
77
+ chatbot-frontend:
78
+ build: .
79
+ command: frontend.py
80
+ ports:
81
+ - "8501:8501"
82
+ volumes:
83
+ - ./data:/app/data
84
+ - ./index:/app/index
85
+ - ./logs:/app/logs
86
+ env_file:
87
+ - .env
88
+ restart: unless-stopped
89
+
90
+ chatbot-admin:
91
+ build: .
92
+ command: admin.py
93
+ ports:
94
+ - "8502:8501"
95
+ volumes:
96
+ - ./data:/app/data
97
+ - ./index:/app/index
98
+ - ./logs:/app/logs
99
+ env_file:
100
+ - .env
101
+ restart: unless-stopped
102
+ ```
103
+
104
+ ### Create .dockerignore File
105
+
106
+ Create a `.dockerignore` file to exclude unnecessary files:
107
+
108
+ ```
109
+ .git
110
+ .gitignore
111
+ .env
112
+ __pycache__/
113
+ *.py[cod]
114
+ *$py.class
115
+ .Python
116
+ *.so
117
+ .pytest_cache/
118
+ .coverage
119
+ htmlcov/
120
+ .tox/
121
+ .nox/
122
+ .hypothesis/
123
+ .idea/
124
+ .vscode/
125
+ ```
126
+
127
+ ## Step 3: Setting Up Secrets Management
128
+
129
+ 1. Create a `.env.example` file with placeholders for required environment variables:
130
+ ```
131
+ ANTHROPIC_API_KEY=your_api_key_here
132
+ ```
133
+
134
+ 2. Ensure the actual `.env` file with real secrets is in `.gitignore`
135
+
136
+ 3. For Streamlit admin secrets, create a `.streamlit/secrets.toml.example`:
137
+ ```toml
138
+ admin_password = "example_password"
139
+ ```
140
+
141
+ ## Step 4: Deployment with Docker Compose
142
+
143
+ 1. Build and start the containers:
144
+ ```bash
145
+ docker-compose up -d
146
+ ```
147
+
148
+ 2. Access the applications:
149
+ - User interface: http://localhost:8501
150
+ - Admin interface: http://localhost:8502
151
+
152
+ ## Step 5: Production Deployment with LlamaDeploy
153
+
154
+ For a production-grade deployment using LlamaDeploy, follow these steps:
155
+
156
+ 1. Install LlamaDeploy:
157
+ ```bash
158
+ pip install llama_deploy
159
+ ```
160
+
161
+ 2. Create a deployment script `deploy_llamadeploy.py`:
162
+
163
+ ```python
164
+ from llama_deploy import LlamaDeploySDK
165
+ from llama_deploy.types import WorkflowDefinition
166
+ from llama_index.core import Settings
167
+ from llama_index.llms.anthropic import Anthropic
168
+ from backend import Chatbot
169
+ import os
170
+
171
+ # Get API key from environment
172
+ api_key = os.environ.get("ANTHROPIC_API_KEY")
173
+ if not api_key:
174
+ raise ValueError("ANTHROPIC_API_KEY environment variable is required")
175
+
176
+ # Initialize the Chatbot
177
+ chatbot = Chatbot()
178
+
179
+ # Define a workflow for the query function
180
+ def query_workflow(query_text: str) -> str:
181
+ """Execute a query against the chatbot backend."""
182
+ response = chatbot.query(query_text)
183
+ return response
184
+
185
+ # Define a workflow for reindexing
186
+ def reindex_workflow() -> str:
187
+ """Update the index with new documents."""
188
+ documents = chatbot.load_documents()
189
+ chatbot.update_index(documents)
190
+ return "Index updated successfully"
191
+
192
+ # Initialize LlamaDeploy SDK
193
+ sdk = LlamaDeploySDK()
194
+
195
+ # Create workflow definitions
196
+ query_definition = WorkflowDefinition(
197
+ name="document_query",
198
+ description="Query documents using the chatbot",
199
+ workflow_fn=query_workflow
200
+ )
201
+
202
+ reindex_definition = WorkflowDefinition(
203
+ name="reindex_documents",
204
+ description="Update the index with new documents",
205
+ workflow_fn=reindex_workflow
206
+ )
207
+
208
+ # Deploy workflows
209
+ query_deployment = sdk.create_deployment(query_definition)
210
+ reindex_deployment = sdk.create_deployment(reindex_definition)
211
+
212
+ print(f"Query API endpoint: {query_deployment.endpoint}")
213
+ print(f"Reindex API endpoint: {reindex_deployment.endpoint}")
214
+ ```
215
+
216
+ 3. Deploy your workflows:
217
+ ```bash
218
+ python deploy_llamadeploy.py
219
+ ```
220
+
221
+ 4. Create a production-ready frontend that connects to the LlamaDeploy endpoints.
222
+
223
+ ## Step 6: Continuous Integration and Deployment (CI/CD)
224
+
225
+ Set up GitHub Actions for CI/CD by creating `.github/workflows/deploy.yml`:
226
+
227
+ ```yaml
228
+ name: Deploy Document Chatbot
229
+
230
+ on:
231
+ push:
232
+ branches: [ main ]
233
+
234
+ jobs:
235
+ test:
236
+ runs-on: ubuntu-latest
237
+ steps:
238
+ - uses: actions/checkout@v3
239
+ - name: Set up Python
240
+ uses: actions/setup-python@v4
241
+ with:
242
+ python-version: '3.9'
243
+ - name: Install dependencies
244
+ run: |
245
+ python -m pip install --upgrade pip
246
+ pip install -r requirements.txt
247
+ - name: Run tests
248
+ run: |
249
+ pytest
250
+
251
+ build-and-push:
252
+ needs: test
253
+ runs-on: ubuntu-latest
254
+ steps:
255
+ - uses: actions/checkout@v3
256
+ - name: Set up Docker Buildx
257
+ uses: docker/setup-buildx-action@v2
258
+ - name: Login to DockerHub
259
+ uses: docker/login-action@v2
260
+ with:
261
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
262
+ password: ${{ secrets.DOCKERHUB_TOKEN }}
263
+ - name: Build and push
264
+ uses: docker/build-push-action@v4
265
+ with:
266
+ push: true
267
+ tags: yourusername/document-chatbot:latest
268
+ ```
269
+
270
+ ## Monitoring and Scaling
271
+
272
+ 1. **Monitoring**: Set up logging to a service like CloudWatch, DataDog, or Grafana
273
+ 2. **Scaling**: Consider using Kubernetes for production deployments with multiple replicas
274
+ 3. **Database-Backed Indexes**: For better persistence and scalability, consider switching to a database-backed vector store like PostgreSQL with pgvector
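A rough sketch of what the pgvector option could look like with LlamaIndex's `PGVectorStore`; the connection details and table name below are placeholders, and the `llama-index-vector-stores-postgres` package is assumed to be installed:

```python
# Sketch: persist the index in Postgres/pgvector instead of the local index/ directory.
# Connection parameters are illustrative placeholders.
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore

vector_store = PGVectorStore.from_params(
    database="chatbot",
    host="localhost",
    port="5432",
    user="postgres",
    password="postgres",
    table_name="document_chunks",
    embed_dim=384,  # matches sentence-transformers/all-MiniLM-L6-v2
)

storage_context = StorageContext.from_defaults(vector_store=vector_store)
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
query_engine = index.as_query_engine()
```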
275
+
276
+ ## Security Considerations
277
+
278
+ 1. Use HTTPS for all production deployments
279
+ 2. Implement proper authentication using OAuth or similar
280
+ 3. Regularly update dependencies and scan for security vulnerabilities
281
+ 4. Restrict access to the admin interface through network controls
282
+
283
+ ## Cost Management
284
+
285
+ 1. Monitor API usage to control costs
286
+ 2. Consider caching common queries
287
+ 3. Use efficient embedding models to reduce computational costs
288
+ 4. Implement rate limiting to prevent abuse
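For the query-caching suggestion (point 2), a minimal in-memory sketch built on the `Chatbot` class from `backend.py`; it only helps with exact-repeat questions and is not a substitute for a shared cache in production:

```python
# Sketch: cache exact-repeat queries in memory to avoid repeated Claude API calls.
from functools import lru_cache

from backend import Chatbot

chatbot = Chatbot()
chatbot.create_index(chatbot.load_documents())
chatbot.initialize_query_engine()

@lru_cache(maxsize=256)
def cached_query(question: str) -> str:
    # The question string is the cache key; identical questions skip the LLM call.
    return chatbot.query(question)

print(cached_query("What documents are indexed?"))
print(cached_query("What documents are indexed?"))  # second call is served from the cache
```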
289
+
290
+ ## Disaster Recovery
291
+
292
+ 1. Regularly backup your data directory and index
293
+ 2. Set up automatic failover where possible
294
+ 3. Document recovery procedures for various failure scenarios
295
+
296
+ ## Conclusion
297
+
298
+ This deployment guide outlines the steps to move your Document Chatbot from local development to production. Each environment may require specific adaptations, so consider this a starting point for your deployment journey.
Future_Improvements.md ADDED
@@ -0,0 +1,95 @@
1
+ Future plans for monitoring poor responses so that we can improve reply quality.
2
+
3
+ > **Note:** We've created an `architecture.md` file in the root directory that outlines our planned modular architecture and development practices. Please refer to that document for guidance on code organization as we implement these improvements.
4
+
5
+ # LlamaIndex Capabilities to Leverage
6
+
7
+ Looking at our current implementation and LlamaIndex's capabilities, here's what we're not using that could help with these issues:
8
+
9
+ ## Document Processing and Metadata:
10
+ - We're using SimpleDirectoryReader but not extracting metadata
11
+ - LlamaIndex has Document class with metadata support
12
+ - We could add title, author, date, etc. during document loading
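A small sketch of that idea using `SimpleDirectoryReader`'s `file_metadata` hook; the specific fields below are illustrative, since richer metadata (title, author) would have to come from the documents themselves:

```python
# Sketch: attach simple metadata to each document as it is loaded.
import os
from datetime import datetime

from llama_index.core import SimpleDirectoryReader

def file_metadata(file_path: str) -> dict:
    # Derive illustrative metadata from the file itself.
    return {
        "file_name": os.path.basename(file_path),
        "modified_at": datetime.fromtimestamp(os.path.getmtime(file_path)).isoformat(),
    }

documents = SimpleDirectoryReader("data", file_metadata=file_metadata).load_data()
print(documents[0].metadata)
```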
13
+
14
+ ## Response Quality:
15
+ - We're using basic as_query_engine()
16
+ - LlamaIndex has ResponseMode settings
17
+ - We could use ResponseSynthesizer for better control
18
+ - We could add SimilarityPostprocessor for filtering results
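A minimal sketch of those two ideas together, assuming the same `data/` directory the app already uses; the `similarity_cutoff` of 0.7 is only an example value to tune:

```python
# Sketch: tighter control over retrieval and synthesis than the bare as_query_engine().
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.postprocessor import SimilarityPostprocessor

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("data").load_data())

query_engine = index.as_query_engine(
    similarity_top_k=5,        # retrieve a few more candidate chunks
    response_mode="compact",   # ResponseMode setting for synthesis
    node_postprocessors=[
        SimilarityPostprocessor(similarity_cutoff=0.7),  # drop weakly related chunks
    ],
)
response = query_engine.query("What does the handbook say about onboarding?")
print(response)
```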
19
+
20
+ ## Feedback and Evaluation:
21
+ - LlamaIndex has an Evaluation module
22
+ - We could use ResponseEvaluator for measuring quality
23
+ - We could implement FeedbackCallback for tracking user feedback
24
+
25
+ ## Document Structure:
26
+ - We're using basic SentenceSplitter
27
+ - LlamaIndex has more sophisticated parsers
28
+ - We could use HierarchicalNodeParser for better structure
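A short sketch of swapping the flat `SentenceSplitter` for `HierarchicalNodeParser`; the chunk sizes are illustrative defaults, not tuned values:

```python
# Sketch: parse documents into a hierarchy of coarse-to-fine chunks instead of a flat split.
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import HierarchicalNodeParser, get_leaf_nodes

documents = SimpleDirectoryReader("data").load_data()
parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[2048, 512, 128])
nodes = parser.get_nodes_from_documents(documents)

leaf_nodes = get_leaf_nodes(nodes)  # finest-grained chunks, linked to their parents
print(f"{len(nodes)} nodes total, {len(leaf_nodes)} leaf nodes")
```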
29
+
30
+ # Feedback System Implementation Plan
31
+
32
+ ## Phase 1: Enhanced UI for Feedback Collection
33
+
34
+ 1. **Add Basic Feedback UI Components**
35
+ - Add thumbs up/down buttons after each answer
36
+ - Create a collapsible feedback form that appears on downvote
37
+ - Categories for feedback (incorrect, incomplete, irrelevant, other)
38
+ - Optional free-text field for detailed feedback
39
+
40
+ 2. **Modify Session Storage**
41
+ - Enhance `st.session_state.messages` to include feedback status
42
+ - Add temporary in-memory tracking of feedback during session
43
+
44
+ ## Phase 2: Persistent Storage Implementation
45
+
46
+ 1. **Create Database Schema**
47
+ - Design schema for Q&A pairs with feedback:
48
+ ```
49
+ conversations: id, timestamp, user_id(optional)
50
+ messages: id, conversation_id, role, content, timestamp
51
+ feedback: id, message_id, rating, category, comment, timestamp
52
+ ```
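A hedged sketch of how that schema might map onto SQLAlchemy models (class and column names below are assumptions for illustration, not an implemented module):

```python
# Sketch: SQLAlchemy models mirroring the conversations/messages/feedback schema above.
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Conversation(Base):
    __tablename__ = "conversations"
    id = Column(Integer, primary_key=True)
    timestamp = Column(DateTime, default=datetime.utcnow)
    user_id = Column(String, nullable=True)  # optional, as in the schema above

class Message(Base):
    __tablename__ = "messages"
    id = Column(Integer, primary_key=True)
    conversation_id = Column(Integer, ForeignKey("conversations.id"))
    role = Column(String)    # "user" or "assistant"
    content = Column(Text)
    timestamp = Column(DateTime, default=datetime.utcnow)

class Feedback(Base):
    __tablename__ = "feedback"
    id = Column(Integer, primary_key=True)
    message_id = Column(Integer, ForeignKey("messages.id"))
    rating = Column(Integer)   # e.g. +1 / -1 for thumbs up / down
    category = Column(String)  # incorrect, incomplete, irrelevant, other
    comment = Column(Text, nullable=True)
    timestamp = Column(DateTime, default=datetime.utcnow)
```

Locally, `Base.metadata.create_all(engine)` against an SQLite engine would create these tables, in line with the SQLite-first suggestion below.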
53
+
54
+ 2. **Add Database Integration**
55
+ - SQLite for local development (simple setup)
56
+ - Consider PostgreSQL for production
57
+ - Implement functions to save all conversations automatically
58
+
59
+ 3. **Implement Migration**
60
+ - Create a migration script to move existing session data to database
61
+ - Add proper indices for efficient queries
62
+
63
+ ## Phase 3: Feedback Analysis Tools
64
+
65
+ 1. **Create Admin Dashboard**
66
+ - Overview of feedback metrics
67
+ - Filter/sort capabilities (by date, rating, category)
68
+ - Export functionality
69
+
70
+ 2. **Implement LlamaIndex Evaluation Tools**
71
+ - Integrate `ResponseEvaluator` from LlamaIndex
72
+ - Compare ratings from users with automated scores
73
+ - Create quality metrics dashboard
74
+
75
+ 3. **Add Batch Analysis Tools**
76
+ - Identify patterns in negative feedback
77
+ - Group similar problematic questions
78
+ - Automated reporting for common issues
79
+
80
+ ## Phase 4: Continuous Improvement System
81
+
82
+ 1. **Implement Document Enhancement Process**
83
+ - Flag documents related to poor answers
84
+ - Create system for document refinement
85
+ - Track improvements over time
86
+
87
+ 2. **Add A/B Testing Capabilities**
88
+ - Test different retrieval strategies
89
+ - Compare different chunking strategies
90
+ - Evaluate response synthesis approaches
91
+
92
+ 3. **Implement Knowledge Gap Detection**
93
+ - Identify topics with consistent poor performance
94
+ - Suggest new documents needed to fill gaps
95
+ - Track coverage improvement over time
README.md ADDED
@@ -0,0 +1,112 @@
1
+ # Document Chatbot
2
+
3
+ A simple chatbot that can answer questions about your documents using LlamaIndex and Anthropic's Claude API.
4
+
5
+ ## Features
6
+
7
+ - Document indexing and search using LlamaIndex
8
+ - Natural language question answering using Claude API
9
+ - Separate user and admin interfaces
10
+ - Handles multiple document formats (PDF, TXT, etc.)
11
+ - Dynamic document reindexing without restarting the application
12
+ - Streamlined UI with persistent "Clear Chat" button
13
+ - Centered layout for improved readability
14
+
15
+ ## Setup
16
+
17
+ 1. Clone this repository:
18
+ ```
19
+ git clone <repository-url>
20
+ cd <repository-directory>
21
+ ```
22
+
23
+ 2. Install the required dependencies:
24
+ ```
25
+ pip install -r requirements.txt
26
+ ```
27
+
28
+ 3. Set up your Anthropic API key by creating a `.env` file with:
29
+ ```
30
+ ANTHROPIC_API_KEY=your-api-key
31
+ ```
32
+
33
+ 4. Place your documents in the `data` directory.
34
+
35
+ 5. (Optional) For the admin interface, create a `.streamlit/secrets.toml` file with:
36
+ ```
37
+ admin_password = "your-secure-password"
38
+ ```
39
+
40
+ ## Usage
41
+
42
+ ### User Interface
43
+
44
+ 1. Start the user interface:
45
+ ```
46
+ streamlit run frontend.py
47
+ ```
48
+
49
+ 2. Open your browser and navigate to `http://localhost:8501`
50
+
51
+ 3. Ask questions about your documents!
52
+
53
+ ### Admin Interface
54
+
55
+ 1. Start the admin interface:
56
+ ```
57
+ streamlit run admin.py
58
+ ```
59
+
60
+ 2. Enter the admin password (default is "admin" if not set in secrets)
61
+
62
+ 3. Use the admin interface to:
63
+ - Upload new documents
64
+ - Delete documents
65
+ - Update the index with new documents
66
+ - Rebuild the entire index
67
+ - View configuration settings
68
+
69
+ ## Interface Separation
70
+
71
+ The application uses two separate interfaces:
72
+
73
+ 1. **User Interface (frontend.py)**: Simple chat interface for asking questions about documents. Users cannot modify documents or the index. Features include:
74
+ - Chat-based question answering
75
+ - Persistent "Clear Chat" button for easy history management
76
+ - Clean, distraction-free UI with hidden Streamlit elements
77
+ - Centered layout for improved readability
78
+
79
+ 2. **Admin Interface (admin.py)**: Protected interface for managing documents and the index. Features include:
80
+ - Document management (upload, delete)
81
+ - Index management (update, rebuild)
82
+ - Configuration settings
83
+ - System information
84
+
85
+ This separation ensures that regular users can't accidentally modify documents or index settings.
86
+
87
+ ## Project Structure
88
+
89
+ - `backend.py`: Core chatbot functionality and document processing
90
+ - `frontend.py`: Streamlit web interface for users
91
+ - `admin.py`: Streamlit admin interface for document management
92
+ - `config.py`: Configuration settings
93
+ - `utils/logging_config.py`: Logging configuration
94
+ - `data/`: Directory for your documents
95
+ - `index/`: Directory for the generated index (created automatically)
96
+ - `architecture.md`: Detailed architecture documentation
97
+ - `DEPLOYMENT.md`: Deployment instructions
98
+ - `Future_Improvements.md`: Planned future enhancements
99
+
100
+ ## Configuration
101
+
102
+ You can modify the chatbot settings in `config.py`, including:
103
+
104
+ - LLM model settings (temperature, max tokens)
105
+ - Embedding model settings
106
+ - Chunking settings for document processing
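For orientation, a minimal sketch of what such a `config.py` might contain; the exact names (`get_chatbot_config()`, the module-level constants) and defaults below are assumptions inferred from how the rest of the code calls into it:

```python
# Sketch of config.py: central defaults plus environment-variable overrides.
import os
from dotenv import load_dotenv

load_dotenv()

ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")

# LLM settings
LLM_MODEL = os.getenv("LLM_MODEL", "claude-3-7-sonnet-20250219")
LLM_TEMPERATURE = float(os.getenv("LLM_TEMPERATURE", "0.1"))
LLM_MAX_TOKENS = int(os.getenv("LLM_MAX_TOKENS", "2048"))

# Embedding settings
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
EMBEDDING_DEVICE = os.getenv("EMBEDDING_DEVICE", "cpu")
EMBEDDING_BATCH_SIZE = int(os.getenv("EMBEDDING_BATCH_SIZE", "8"))

# Chunking settings
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1024"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "100"))

# Directories
DATA_DIR = os.getenv("DATA_DIR", "data")

def get_chatbot_config() -> dict:
    """Return the settings as the dictionary expected by backend.Chatbot."""
    return {
        "model": LLM_MODEL,
        "temperature": LLM_TEMPERATURE,
        "max_tokens": LLM_MAX_TOKENS,
        "embedding_model": EMBEDDING_MODEL,
        "device": EMBEDDING_DEVICE,
        "embed_batch_size": EMBEDDING_BATCH_SIZE,
        "chunk_size": CHUNK_SIZE,
        "chunk_overlap": CHUNK_OVERLAP,
    }
```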
107
+
108
+ ## Requirements
109
+
110
+ - Python 3.7+
111
+ - Anthropic API key
112
+ - See `requirements.txt` for full dependencies
admin.py ADDED
@@ -0,0 +1,248 @@
1
+ """
2
+ Admin interface for document management and reindexing.
3
+ """
4
+ import streamlit as st
5
+ import os
6
+ import shutil
7
+ from backend import Chatbot
8
+ from utils.logging_config import setup_logging
9
+ import config
10
+
11
+ # Setup logging
12
+ logger = setup_logging()
13
+
14
+ # Set page config
15
+ st.set_page_config(
16
+ page_title="Document Chatbot Admin",
17
+ page_icon="🔧",
18
+ layout="wide"
19
+ )
20
+
21
+ # Hide just the deploy button
22
+ hide_deploy_button = """
23
+ <style>
24
+ button[kind="deploy"],
25
+ .stDeployButton {
26
+ display: none !important;
27
+ }
28
+ </style>
29
+ """
30
+ st.markdown(hide_deploy_button, unsafe_allow_html=True)
31
+
32
+ # Simple authentication
33
+ def check_password():
34
+ """Returns `True` if the user had the correct password."""
35
+ def password_entered():
36
+ """Checks whether a password entered by the user is correct."""
37
+ try:
38
+ correct_password = st.secrets.get("admin_password", "admin")
39
+ except Exception:
40
+ # Fallback to default password if secrets are not available
41
+ correct_password = "admin"
42
+
43
+ if st.session_state["password"] == correct_password:
44
+ st.session_state["password_correct"] = True
45
+ del st.session_state["password"] # Don't store the password
46
+ else:
47
+ st.session_state["password_correct"] = False
48
+
49
+ if "password_correct" not in st.session_state:
50
+ # First run, show input for password
51
+ st.text_input(
52
+ "Password", type="password", on_change=password_entered, key="password"
53
+ )
54
+ return False
55
+ elif not st.session_state["password_correct"]:
56
+ # Password not correct, show input + error
57
+ st.text_input(
58
+ "Password", type="password", on_change=password_entered, key="password"
59
+ )
60
+ st.error("😕 Password incorrect")
61
+ return False
62
+ else:
63
+ # Password correct
64
+ return True
65
+
66
+ # Initialize chatbot
67
+ def initialize_chatbot():
68
+ if "chatbot" not in st.session_state:
69
+ with st.spinner("Initializing chatbot..."):
70
+ # Get configuration from config module
71
+ chatbot_config = config.get_chatbot_config()
72
+
73
+ # Initialize chatbot
74
+ logger.info("Initializing chatbot...")
75
+ st.session_state.chatbot = Chatbot(chatbot_config)
76
+
77
+ # Load documents and create index
78
+ documents = st.session_state.chatbot.load_documents()
79
+ st.session_state.chatbot.create_index(documents)
80
+ st.session_state.chatbot.initialize_query_engine()
81
+ logger.info("Chatbot initialized successfully")
82
+
83
+ # Main admin interface
84
+ if check_password():
85
+ # Initialize the chatbot
86
+ initialize_chatbot()
87
+
88
+ # Admin header
89
+ st.title("🔧 Document Chatbot Admin")
90
+ st.markdown("""
91
+ This is the admin interface for managing documents and index settings.
92
+ """)
93
+
94
+ # Tabs for different admin functions
95
+ tab1, tab2, tab3 = st.tabs(["Document Management", "Index Management", "Configuration"])
96
+
97
+ # Document Management Tab
98
+ with tab1:
99
+ st.header("Document Management")
100
+
101
+ # Show current documents
102
+ data_dir = "data"
103
+ col1, col2 = st.columns([3, 1])
104
+
105
+ with col1:
106
+ st.subheader("Current Documents")
107
+ if os.path.exists(data_dir):
108
+ documents = [f for f in os.listdir(data_dir) if not f.startswith('.')]
109
+ if documents:
110
+ for doc in documents:
111
+ doc_col1, doc_col2 = st.columns([5, 1])
112
+ doc_col1.write(f"📄 {doc}")
113
+ if doc_col2.button("Delete", key=f"delete_{doc}"):
114
+ try:
115
+ os.remove(os.path.join(data_dir, doc))
116
+ st.success(f"Deleted {doc}")
117
+ st.rerun()
118
+ except Exception as e:
119
+ st.error(f"Error deleting {doc}: {str(e)}")
120
+ else:
121
+ st.info("No documents found in data directory.")
122
+ else:
123
+ st.error("Data directory not found.")
124
+ if st.button("Create Data Directory"):
125
+ os.makedirs(data_dir)
126
+ st.success("Data directory created.")
127
+ st.rerun()
128
+
129
+ with col2:
130
+ st.subheader("Upload Document")
131
+ uploaded_file = st.file_uploader("Choose a file", type=["txt", "pdf", "docx", "md"])
132
+
133
+ if uploaded_file is not None:
134
+ # Save the uploaded file
135
+ if not os.path.exists(data_dir):
136
+ os.makedirs(data_dir)
137
+
138
+ file_path = os.path.join(data_dir, uploaded_file.name)
139
+ with open(file_path, "wb") as f:
140
+ f.write(uploaded_file.getbuffer())
141
+
142
+ st.success(f"File {uploaded_file.name} saved successfully!")
143
+
144
+ # Index Management Tab
145
+ with tab2:
146
+ st.header("Index Management")
147
+
148
+ # Index Status
149
+ st.subheader("Index Status")
150
+ index_dir = "index"
151
+ if os.path.exists(index_dir):
152
+ index_files = os.listdir(index_dir)
153
+ if index_files:
154
+ st.write(f"Index exists with {len(index_files)} files")
155
+ total_size = sum(os.path.getsize(os.path.join(index_dir, f)) for f in index_files)
156
+ st.write(f"Total index size: {total_size/1024:.2f} KB")
157
+
158
+ # Display index files
159
+ with st.expander("Index Files"):
160
+ for file in index_files:
161
+ st.write(f"- {file}: {os.path.getsize(os.path.join(index_dir, file))/1024:.2f} KB")
162
+ else:
163
+ st.warning("Index directory exists but is empty.")
164
+ else:
165
+ st.error("Index directory not found.")
166
+
167
+ # Reindex Options
168
+ st.subheader("Reindex Options")
169
+
170
+ col1, col2 = st.columns(2)
171
+
172
+ with col1:
173
+ if st.button("Update Index", help="Add new documents to the existing index"):
174
+ with st.spinner("Updating index..."):
175
+ try:
176
+ # Load fresh documents
177
+ documents = st.session_state.chatbot.load_documents()
178
+ # Update the index with new documents
179
+ st.session_state.chatbot.update_index(documents)
180
+ st.success("Index updated successfully!")
181
+ logger.info("Index updated successfully")
182
+ except Exception as e:
183
+ st.error(f"Error updating index: {str(e)}")
184
+ logger.error(f"Error updating index: {e}")
185
+
186
+ with col2:
187
+ if st.button("Rebuild Index", help="Delete and recreate the entire index"):
188
+ with st.spinner("Rebuilding index..."):
189
+ try:
190
+ # Delete index directory
191
+ if os.path.exists(index_dir):
192
+ shutil.rmtree(index_dir)
193
+ os.makedirs(index_dir)
194
+ logger.info("Index directory cleared")
195
+
196
+ # Load documents and recreate index
197
+ documents = st.session_state.chatbot.load_documents()
198
+ st.session_state.chatbot.create_index(documents)
199
+ st.session_state.chatbot.initialize_query_engine()
200
+
201
+ st.success("Index rebuilt successfully!")
202
+ logger.info("Index rebuilt successfully")
203
+ except Exception as e:
204
+ st.error(f"Error rebuilding index: {str(e)}")
205
+ logger.error(f"Error rebuilding index: {e}")
206
+
207
+ # Configuration Tab
208
+ with tab3:
209
+ st.header("Configuration")
210
+
211
+ # Display current configuration
212
+ st.subheader("Current Configuration")
213
+ chatbot_config = config.get_chatbot_config()
214
+
215
+ # LLM Settings
216
+ st.write("#### LLM Settings")
217
+ llm_col1, llm_col2 = st.columns(2)
218
+
219
+ model = llm_col1.text_input("Model", value=chatbot_config.get("model", "claude-3-7-sonnet-20250219"))
220
+ temperature = llm_col2.slider("Temperature", min_value=0.0, max_value=1.0, value=chatbot_config.get("temperature", 0.1), step=0.1)
221
+
222
+ # Embedding Settings
223
+ st.write("#### Embedding Settings")
224
+ emb_col1, emb_col2 = st.columns(2)
225
+
226
+ embedding_model = emb_col1.text_input("Embedding Model", value=chatbot_config.get("embedding_model", "sentence-transformers/all-MiniLM-L6-v2"))
227
+ device = emb_col2.selectbox("Device", options=["cpu", "cuda"], index=0 if chatbot_config.get("device", "cpu") == "cpu" else 1)
228
+
229
+ # Text Splitter Settings
230
+ st.write("#### Text Splitter Settings")
231
+ split_col1, split_col2 = st.columns(2)
232
+
233
+ chunk_size = split_col1.number_input("Chunk Size", min_value=100, max_value=4096, value=chatbot_config.get("chunk_size", 1024), step=100)
234
+ chunk_overlap = split_col2.number_input("Chunk Overlap", min_value=0, max_value=1000, value=chatbot_config.get("chunk_overlap", 100), step=10)
235
+
236
+ # Save Configuration Button
237
+ if st.button("Save Configuration"):
238
+ st.warning("Configuration saving is not implemented yet.")
239
+ # In a real implementation, you would update the config file or database
240
+
241
+ # Footer
242
+ st.markdown("---")
243
+ st.markdown("⚠️ **Note**: This admin interface should be protected and not accessible to regular users.")
244
+
245
+ else:
246
+ # Show login screen only
247
+ st.title("🔧 Document Chatbot Admin")
248
+ st.markdown("Please enter the password to access the admin interface.")
app.py CHANGED
@@ -1,6 +1,13 @@
+"""
+Streamlit web interface for the chatbot application.
+"""
 import streamlit as st
 from chatbot import Chatbot
-import time
+from utils.logging_config import setup_logging
+import config
+
+# Setup logging
+logger = setup_logging()
 
 # Set page config
 st.set_page_config(
@@ -16,20 +23,18 @@ if "messages" not in st.session_state:
 # Initialize chatbot
 if "chatbot" not in st.session_state:
     with st.spinner("Initializing chatbot..."):
-        config = {
-            "model": "claude-3-7-sonnet-20250219",
-            "temperature": 0.1,
-            "max_tokens": 2048,
-            "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
-            "device": "cpu",
-            "embed_batch_size": 8,
-            "chunk_size": 1024,
-            "chunk_overlap": 100
-        }
-        st.session_state.chatbot = Chatbot(config)
-        st.session_state.chatbot.load_documents()
-        st.session_state.chatbot.create_index(st.session_state.chatbot.load_documents())
+        # Get configuration from config module
+        chatbot_config = config.get_chatbot_config()
+
+        # Initialize chatbot
+        logger.info("Initializing chatbot...")
+        st.session_state.chatbot = Chatbot(chatbot_config)
+
+        # Load documents and create index
+        documents = st.session_state.chatbot.load_documents()
+        st.session_state.chatbot.create_index(documents)
         st.session_state.chatbot.initialize_query_engine()
+        logger.info("Chatbot initialized successfully")
 
 # Title and description
 st.title("📚 Document Chatbot")
@@ -53,9 +58,11 @@ if prompt := st.chat_input("What would you like to know?"):
     # Get chatbot response
     with st.chat_message("assistant"):
         with st.spinner("Thinking..."):
+            logger.info(f"User query: {prompt}")
             response = st.session_state.chatbot.query(prompt)
             st.markdown(response)
             st.session_state.messages.append({"role": "assistant", "content": response})
+            logger.info("Response provided to user")
 
 # Sidebar with information
 with st.sidebar:
@@ -72,4 +79,5 @@ with st.sidebar:
     # Add a clear chat button
     if st.button("Clear Chat History"):
         st.session_state.messages = []
+        logger.info("Chat history cleared")
         st.rerun()
architecture.md ADDED
@@ -0,0 +1,305 @@
1
+ # Document Chatbot Architecture
2
+
3
+ This document describes the architecture of the Document Chatbot application, its key components, and design decisions.
4
+
5
+ ## Application Components
6
+
7
+ The application follows a clear separation of concerns with separate frontend and admin interfaces:
8
+
9
+ ### User Interface (`frontend.py`)
10
+
11
+ - Built with Streamlit
12
+ - Provides a chat-like interface for user interaction
13
+ - Focused solely on the chat experience
14
+ - Streamlined UI with hidden Streamlit elements
15
+ - Centered layout for improved readability
16
+ - Persistent "Clear Chat" button in collapsible sidebar
17
+ - Maintains session state for consistent user experience
18
+
19
+ ### Admin Interface (`admin.py`)
20
+
21
+ - Separate Streamlit application with password protection
22
+ - Provides administrative functionality:
23
+ - Document management (upload, delete)
24
+ - Index management (update, rebuild)
25
+ - Configuration settings
26
+ - System information
27
+ - Organized into tabs for different functionality
28
+ - Protected with a simple password authentication system
29
+
30
+ ### Backend (`backend.py`)
31
+
32
+ - Core chatbot functionality
33
+ - Handles document loading, indexing, and querying
34
+ - Manages the LlamaIndex and Claude API interactions
35
+ - Implements error handling and retry logic
36
+ - Provides document reindexing capabilities
37
+ - Shared by both the user and admin interfaces
38
+
39
+ ### Configuration (`config.py`)
40
+
41
+ - Centralizes application settings
42
+ - Manages API keys and model parameters
43
+ - Configures embedding and chunking settings
44
+
45
+ ### Utilities
46
+
47
+ - Logging configuration
48
+ - Helper functions
49
+
50
+ ## Data Flow
51
+
52
+ ### User Interface Flow
53
+
54
+ 1. User submits a question through the Streamlit UI
55
+ 2. The frontend passes the question to the backend
56
+ 3. The backend:
57
+ - Processes the question
58
+ - Retrieves relevant documents from the index
59
+ - Generates a response using the Claude API
60
+ 4. The response is displayed to the user in the UI
61
+
62
+ ### Admin Interface Flow
63
+
64
+ 1. Admin enters password to access the interface
65
+ 2. Admin manages documents and index:
66
+ - Uploads new documents to the data directory
67
+ - Deletes existing documents
68
+ - Updates the index with new documents
69
+ - Rebuilds the index completely if needed
70
+ 3. Admin views system information and configuration settings
71
+
72
+ ## Document Indexing System
73
+
74
+ The application includes a dynamic document indexing system that allows adding new documents without disrupting the service.
75
+
76
+ ### How Document Indexing Works
77
+
78
+ 1. **Document Loading**: The `load_documents()` method in `backend.py` uses LlamaIndex's `SimpleDirectoryReader` to read all documents from the `data` directory.
79
+
80
+ 2. **Index Management**:
81
+ - If no index exists, `create_index()` creates a new one
82
+ - For adding new documents, `update_index()` adds them to the existing index
83
+
84
+ 3. **Efficient Updates**:
85
+ - The `update_index()` method inserts each new document into the existing index
86
+ - Only new documents are processed, avoiding rebuilding the entire index
87
+ - The updated index is persisted to disk
88
+
89
+ 4. **Admin Interface**:
90
+ - The admin interface provides buttons for updating or rebuilding the index
91
+ - Document uploading and deletion happens in the admin interface
92
+ - Success/failure feedback is displayed to the admin
93
+
94
+ ### Benefits of the Approach
95
+
96
+ - **Role Separation**: Users interact with content while admins manage it
97
+ - **Zero Downtime**: Users can continue using the application during reindexing
98
+ - **Efficiency**: Only processes new documents instead of rebuilding the entire index
99
+ - **Security**: Administrative functions are protected by authentication
100
+
101
+ ## Authentication
102
+
103
+ The admin interface uses a simple password-based authentication system:
104
+
105
+ 1. Password is stored in `.streamlit/secrets.toml`
106
+ 2. Default password is "admin" if not set in secrets
107
+ 3. Session state is used to track authentication status
108
+
109
+ This provides basic security to prevent unauthorized access to administrative functions.
110
+
111
+ ## Future Considerations
112
+
113
+ For larger-scale deployments:
114
+
115
+ 1. **Database-Backed Indices**: Switch to PostgreSQL-backed vector store for better concurrency
116
+ 2. **Background Processing**: Move reindexing to background tasks for larger document sets
117
+ 3. **API-Based Architecture**: Separate frontend and backend completely with REST API
118
+ 4. **Enhanced Authentication**: Implement more robust authentication using OAuth or SAML
119
+ 5. **Document Change Detection**: Implement file system watchers for automatic reindexing
120
+
121
+ ## Design Decisions
122
+
123
+ 1. **Separate Interfaces**: Creating distinct interfaces for users and admins ensures clean separation of concerns
124
+
125
+ 2. **LlamaIndex for Document Processing**: Chosen for its powerful document handling capabilities and simple API
126
+
127
+ 3. **Claude API for Question Answering**: Selected for high-quality responses and good context handling
128
+
129
+ 4. **Streamlit for UI**: Allows for rapid development of interactive interfaces with minimal code
130
+
131
+ 5. **Simple Folder-Based Document Storage**: Prioritizes ease of use over complex document management
132
+
133
+ 6. **UI Optimization**:
134
+ - Centered layout for improved readability
135
+ - Hidden deployment buttons and unnecessary Streamlit elements
136
+ - Persistent "Clear Chat" button in collapsible sidebar
137
+ - Consistent styling and color scheme
138
+
139
+ ## Ideal File Organization & Structure
140
+
141
+ We aim to maintain a modular, well-organized codebase with these principles:
142
+
143
+ ### File Size Guidelines
144
+ - **Python files**: 300-500 lines maximum
145
+ - **Single modules/classes**: 100-200 lines
146
+ - **Functions**: Under 50 lines each
147
+
148
+ Keeping files within these limits improves:
149
+ - Code navigation and readability
150
+ - Mental model of component relationships
151
+ - Ability to make targeted changes
152
+ - Testing and maintenance
153
+
154
+ ### Recommended Directory Structure
155
+
156
+ ```
157
+ llamaindex-app/
158
+ ├── frontend.py # Streamlit frontend interface
159
+ ├── backend.py # Runnable backend with CLI interface
160
+ ├── config.py # Configuration settings
161
+ ├── requirements.txt # Dependencies
162
+ ├── .env # Environment variables (not in git)
163
+ ├── architecture.md # This document
164
+ ├── Future_Improvements.md # Planned improvements
165
+ ├── chatbot/ # Additional chatbot modules (future)
166
+ │ ├── __init__.py
167
+ │ ├── document_loader.py # Document processing (future)
168
+ │ └── llm.py # LLM configuration (future)
169
+ ├── database/ # Database functionality
170
+ │ ├── __init__.py
171
+ │ ├── models.py # Database schema models (future)
172
+ │ ├── repository.py # Data access layer (future)
173
+ │ └── migrations/ # Database migrations (future)
174
+ ├── feedback/ # Feedback system
175
+ │ ├── __init__.py
176
+ │ ├── collector.py # UI and gathering functions (future)
177
+ │ ├── storage.py # Saving feedback data (future)
178
+ │ └── analysis.py # Feedback analysis tools (future)
179
+ ├── admin/ # Admin dashboard
180
+ │ ├── __init__.py
181
+ │ ├── dashboard.py # Main admin interface (future)
182
+ │ └── reports.py # Reporting functionality (future)
183
+ ├── utils/ # Shared utilities
184
+ │ ├── __init__.py
185
+ │ ├── logging_config.py # Logging configuration
186
+ │ └── helpers.py # Common helper functions (future)
187
+ ├── data/ # Your document data
188
+ ├── index/ # Vector index storage
189
+ └── tests/ # Test suite
190
+ ├── test_config.py # Config module tests
191
+ ├── test_chatbot.py # Chatbot tests (future)
192
+ └── test_feedback.py # Feedback tests (future)
193
+ ```
194
+
195
+ ## Architecture Principles
196
+
197
+ ### 1. Frontend/Backend Separation
198
+ - `frontend.py` provides a web interface using Streamlit
199
+ - `backend.py` contains the core chatbot logic and a CLI interface
200
+ - Both can be run independently depending on user needs
201
+
202
+ ### 2. Dependency Injection
203
+ - Components receive dependencies rather than creating them
204
+ - Makes testing and switching implementations easier
205
+ - Facilitates mocking for unit tests
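The current `Chatbot` still builds its own LLM and embedding model; as a hypothetical sketch of where this principle points, a component could receive its query engine from outside, which makes a fake trivial to substitute in tests:

```python
# Hypothetical sketch of constructor injection (not the current Chatbot code).
from typing import Protocol

class SupportsQuery(Protocol):
    def query(self, text: str) -> str: ...

class ChatService:
    """Receives its query engine instead of building LlamaIndex components itself."""
    def __init__(self, query_engine: SupportsQuery):
        self.query_engine = query_engine

    def ask(self, question: str) -> str:
        return self.query_engine.query(question)

# In a unit test, a trivial fake stands in for the real engine:
class FakeEngine:
    def query(self, text: str) -> str:
        return f"echo: {text}"

def test_ask_returns_engine_response():
    service = ChatService(FakeEngine())
    assert service.ask("hello") == "echo: hello"
```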
206
+
207
+ ### 3. Deployment Options
208
+ - Run just the backend for command-line interaction
209
+ - Run the frontend for a web-based interface
210
+ - Both use the same underlying chatbot functionality
211
+
212
+ ### 4. Configuration Management
213
+ - External configuration for environment-specific settings
214
+ - Environment variables for sensitive data
215
+ - Config files for application settings
216
+
217
+ ### 5. Database Abstraction
218
+ - Repository pattern to abstract database operations
219
+ - Migration system for schema changes
220
+ - ORM for database portability
221
+
222
+ ## Cloud-Ready Development Practices
223
+
224
+ To ensure smooth transition to cloud hosting:
225
+
226
+ ### 1. Environment Configuration
227
+ - Use `.env` files locally
228
+ - Design for environment variables in production
229
+ - Create a `config.py` that handles both scenarios
230
+
231
+ ```python
232
+ # Example config.py approach
233
+ import os
234
+ from dotenv import load_dotenv
235
+
236
+ # Load .env file if it exists
237
+ load_dotenv()
238
+
239
+ # Configuration with fallbacks
240
+ DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///local.db")
241
+ API_KEY = os.getenv("ANTHROPIC_API_KEY")
242
+ DEBUG = os.getenv("DEBUG", "False").lower() == "true"
243
+ ```
244
+
245
+ ### 2. Database Strategy
246
+ - Use SQLite locally but design for PostgreSQL/MySQL
247
+ - Use SQLAlchemy as ORM for database portability
248
+ - Implement connection pooling for production
249
+
250
+ ### 3. File Storage
251
+ - Design with cloud storage in mind (S3-compatible interface)
252
+ - Abstract file operations behind a service interface
253
+ - Plan for distributed storage in production
254
+
255
+ ### 4. Stateless Design
256
+ - Don't rely on local file system for temporary storage
257
+ - Use database or distributed cache for shared state
258
+ - Design components to be horizontally scalable
259
+
260
+ ### 5. Containerization
261
+ - Use Docker during development
262
+ - Create a Dockerfile and docker-compose.yml early
263
+ - Plan for orchestration (Kubernetes, ECS) in production
264
+
265
+ ## Implementation Roadmap
266
+
267
+ 1. **Refactoring (Current Phase)**
268
+ - Move `chatbot.py` functionality into organized modules
269
+ - Implement basic dependency injection
270
+ - Setup configuration management
271
+
272
+ 2. **Database Integration**
273
+ - Implement SQLAlchemy models for feedback
274
+ - Create repository classes for data access
275
+ - Add migration system (Alembic)
276
+
277
+ 3. **Feedback System**
278
+ - Add UI components for feedback
279
+ - Connect to database layer
280
+ - Implement storage and retrieval
281
+
282
+ 4. **Analysis Tools**
283
+ - Build admin dashboard
284
+ - Implement feedback analysis functions
285
+ - Create reporting capabilities
286
+
287
+ ## Deployment Considerations
288
+
289
+ ### Development Environment
290
+ - Local development with SQLite
291
+ - Environment variables for secrets
292
+ - Local vector index storage
293
+
294
+ ### Production Environment
295
+ - Cloud hosting (AWS, Azure, GCP)
296
+ - Managed database service
297
+ - Vector database service or optimized self-hosted
298
+ - Container orchestration for scaling
299
+
300
+ ## Monitoring and Maintenance
301
+
302
+ - Implement logging throughout the application
303
+ - Track key metrics (response times, user feedback)
304
+ - Regular database backups
305
+ - Version control for document corpus and vector indices
backend.py ADDED
@@ -0,0 +1,270 @@
1
+ from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
2
+ from llama_index.llms.anthropic import Anthropic
3
+ from llama_index.embeddings.huggingface import HuggingFaceEmbedding
4
+ from llama_index.core.node_parser import SentenceSplitter
5
+ from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
6
+ from llama_index.core import StorageContext, load_index_from_storage
7
+ import logging
8
+ import os
9
+ from dotenv import load_dotenv
10
+ import time
11
+ from typing import Optional, Dict, Any, List
12
+ from tqdm import tqdm
13
+
14
+ # Set up logging to track what the chatbot is doing
15
+ logging.basicConfig(
16
+ level=logging.INFO,
17
+ format='%(asctime)s - %(levelname)s - %(message)s'
18
+ )
19
+ logger = logging.getLogger(__name__)
20
+
21
+ # Disable tokenizer parallelism warnings
22
+ os.environ["TOKENIZERS_PARALLELISM"] = "false"
23
+
24
+ # Create a directory for storing the index
25
+ INDEX_DIR = "index"
26
+ if not os.path.exists(INDEX_DIR):
27
+ os.makedirs(INDEX_DIR)
28
+
29
+ class Chatbot:
30
+ def __init__(self, config: Optional[Dict[str, Any]] = None):
31
+ """Initialize the chatbot with configuration."""
32
+ # Set up basic variables and load configuration
33
+ self.config = config or {}
34
+ self.api_key = self._get_api_key()
35
+ self.index = None
36
+ self.query_engine = None
37
+ self.llm = None
38
+ self.embed_model = None
39
+
40
+ # Set up debugging tools to help track any issues
41
+ self.debug_handler = LlamaDebugHandler(print_trace_on_end=True)
42
+ self.callback_manager = CallbackManager([self.debug_handler])
43
+
44
+ # Set up all the components needed for the chatbot
45
+ self._initialize_components()
46
+
47
+ def _get_api_key(self) -> str:
48
+ """Get API key from environment or config."""
49
+ # Load the API key from environment variables or config file
50
+ load_dotenv()
51
+ api_key = os.getenv("ANTHROPIC_API_KEY") or self.config.get("api_key")
52
+ if not api_key:
53
+ raise ValueError("API key not found in environment or config")
54
+ return api_key
55
+
56
+ def _initialize_components(self):
57
+ """Initialize all components with proper error handling."""
58
+ try:
59
+ # Set up the language model (Claude) with our settings
60
+ logger.info("Setting up Claude language model...")
61
+ self.llm = Anthropic(
62
+ api_key=self.api_key,
63
+ model=self.config.get("model", "claude-3-7-sonnet-20250219"),
64
+ temperature=self.config.get("temperature", 0.1),
65
+ max_tokens=self.config.get("max_tokens", 2048) # Allow for longer responses
66
+ )
67
+
68
+ # Set up the model that converts text into numbers (embeddings)
69
+ logger.info("Setting up text embedding model...")
70
+ self.embed_model = HuggingFaceEmbedding(
71
+ model_name=self.config.get("embedding_model", "sentence-transformers/all-MiniLM-L6-v2"),
72
+ device=self.config.get("device", "cpu"),
73
+ embed_batch_size=self.config.get("embed_batch_size", 8)
74
+ )
75
+
76
+ # Configure all the settings for the chatbot
77
+ logger.info("Configuring chatbot settings...")
78
+ Settings.embed_model = self.embed_model
79
+ Settings.text_splitter = SentenceSplitter(
80
+ chunk_size=self.config.get("chunk_size", 1024),
81
+ chunk_overlap=self.config.get("chunk_overlap", 100),
82
+ paragraph_separator="\n\n"
83
+ )
84
+ Settings.llm = self.llm
85
+ Settings.callback_manager = self.callback_manager
86
+
87
+ logger.info("Components initialized successfully")
88
+
89
+ except Exception as e:
90
+ logger.error(f"Error initializing components: {e}")
91
+ raise
92
+
93
+ def load_documents(self, data_dir: str = "data"):
94
+ """Load documents with retry logic."""
95
+ # Try to load documents up to 3 times if there's an error
96
+ max_retries = 3
97
+ retry_delay = 1
98
+
99
+ for attempt in range(max_retries):
100
+ try:
101
+ logger.info(f"Loading documents from {data_dir}...")
102
+ documents = SimpleDirectoryReader(data_dir).load_data()
103
+ logger.info(f"Loaded {len(documents)} documents")
104
+ return documents
105
+ except Exception as e:
106
+ if attempt < max_retries - 1:
107
+ logger.warning(f"Attempt {attempt + 1} failed: {e}. Retrying in {retry_delay} seconds...")
108
+ time.sleep(retry_delay)
109
+ else:
110
+ logger.error(f"Failed to load documents after {max_retries} attempts: {e}")
111
+ raise
112
+
113
+ def create_index(self, documents):
114
+ """Create index with error handling."""
115
+ try:
116
+ # Check if index already exists
117
+ if os.path.exists(os.path.join(INDEX_DIR, "docstore.json")):
118
+ logger.info("Loading existing index...")
119
+ storage_context = StorageContext.from_defaults(persist_dir=INDEX_DIR)
120
+ self.index = load_index_from_storage(storage_context)
121
+ logger.info("Index loaded successfully")
122
+ return
123
+
124
+ # Create a new index if none exists
125
+ logger.info("Creating new index...")
126
+ with tqdm(total=1, desc="Creating searchable index") as pbar:
127
+ self.index = VectorStoreIndex.from_documents(documents)
128
+ # Save the index
129
+ self.index.storage_context.persist(persist_dir=INDEX_DIR)
130
+ pbar.update(1)
131
+ logger.info("Index created and saved successfully")
132
+ except Exception as e:
133
+ logger.error(f"Error creating/loading index: {e}")
134
+ raise
135
+
136
+ def update_index(self, new_documents: List):
137
+ """Update existing index with new documents without rebuilding.
138
+
139
+ Args:
140
+ new_documents: List of new documents to add to the index
141
+
142
+ Returns:
143
+ None, updates self.index in place
144
+ """
145
+ try:
146
+ if self.index is None:
147
+ logger.warning("No existing index found. Creating new index instead.")
148
+ self.create_index(new_documents)
149
+ return
150
+
151
+ logger.info(f"Updating index with {len(new_documents)} new documents...")
152
+ with tqdm(total=1, desc="Updating searchable index") as pbar:
153
+ # Insert the new documents into the existing index
154
+ for doc in new_documents:
155
+ self.index.insert(doc)
156
+
157
+ # Persist the updated index
158
+ self.index.storage_context.persist(persist_dir=INDEX_DIR)
159
+ pbar.update(1)
160
+
161
+ logger.info("Index updated and saved successfully")
162
+
163
+ # Reinitialize query engine with updated index
164
+ self.initialize_query_engine()
165
+
166
+ except Exception as e:
167
+ logger.error(f"Error updating index: {e}")
168
+ raise
169
+
170
+ def initialize_query_engine(self):
171
+ """Initialize query engine with error handling."""
172
+ try:
173
+ # Set up the system that will handle questions
174
+ logger.info("Initializing query engine...")
175
+ self.query_engine = self.index.as_query_engine()
176
+ logger.info("Query engine initialized successfully")
177
+ except Exception as e:
178
+ logger.error(f"Error initializing query engine: {e}")
179
+ raise
180
+
181
+ def query(self, query_text: str) -> str:
182
+ """Execute a query with error handling and retries."""
183
+ # Try to answer questions up to 3 times if there's an error
184
+ max_retries = 3
185
+ retry_delay = 1
186
+
187
+ for attempt in range(max_retries):
188
+ try:
189
+ logger.info(f"Executing query: {query_text}")
190
+ print("\nThinking...", end="", flush=True)
191
+ response = self.query_engine.query(query_text)
192
+ print(" Done!")
193
+ logger.info("Query executed successfully")
194
+ return str(response)
195
+ except Exception as e:
196
+ if attempt < max_retries - 1:
197
+ logger.warning(f"Attempt {attempt + 1} failed: {e}. Retrying in {retry_delay} seconds...")
198
+ time.sleep(retry_delay)
199
+ else:
200
+ logger.error(f"Failed to execute query after {max_retries} attempts: {e}")
201
+ raise
202
+
203
+ def cleanup(self):
204
+ """Clean up resources."""
205
+ try:
206
+ # Clean up any resources we used
207
+ logger.info("Cleaning up resources...")
208
+ logger.info("Cleanup completed successfully")
209
+ except Exception as e:
210
+ logger.error(f"Error during cleanup: {e}")
211
+
212
+ def main():
213
+ # Set up all the configuration settings for the chatbot
214
+ config = {
215
+ "model": "claude-3-7-sonnet-20250219",
216
+ "temperature": 0.1,
217
+ "max_tokens": 2048, # Allow for longer responses
218
+ "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
219
+ "device": "cpu",
220
+ "embed_batch_size": 8,
221
+ "chunk_size": 1024,
222
+ "chunk_overlap": 100
223
+ }
224
+
225
+ chatbot = None
226
+ try:
227
+ # Create and set up the chatbot
228
+ print("\nInitializing chatbot...")
229
+ chatbot = Chatbot(config)
230
+
231
+ # Load the documents we want to analyze
232
+ documents = chatbot.load_documents()
233
+
234
+ # Create a searchable index from the documents
235
+ chatbot.create_index(documents)
236
+
237
+ # Set up the system that will handle questions
238
+ chatbot.initialize_query_engine()
239
+
240
+ print("\nChatbot is ready! You can ask questions about your documents.")
241
+ print("Type 'exit' to quit.")
242
+ print("-" * 50)
243
+
244
+ while True:
245
+ # Get user input
246
+ question = input("\nYour question: ").strip()
247
+
248
+ # Check if user wants to exit
249
+ if question.lower() in ['exit', 'quit', 'bye']:
250
+ print("\nGoodbye!")
251
+ break
252
+
253
+ # Skip empty questions
254
+ if not question:
255
+ continue
256
+
257
+ # Get and print the response
258
+ response = chatbot.query(question)
259
+ print("\nAnswer:", response)
260
+ print("-" * 50)
261
+
262
+ except Exception as e:
263
+ logger.error(f"Error in main: {e}")
264
+ finally:
265
+ # Clean up when we're done
266
+ if chatbot:
267
+ chatbot.cleanup()
268
+
269
+ if __name__ == "__main__":
270
+ main()
chatbot/__init__.py ADDED
@@ -0,0 +1,6 @@
+"""
+Chatbot module for document question answering using LlamaIndex.
+"""
+from chatbot.core import Chatbot
+
+__all__ = ["Chatbot"]
chatbot/core.py ADDED
@@ -0,0 +1,223 @@
1
+ """
2
+ Core chatbot implementation for document question answering.
3
+ """
4
+ import logging
5
+ import os
6
+ import time
7
+ from typing import Optional, Dict, Any, List
8
+
9
+ from tqdm import tqdm
10
+ from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings, StorageContext, load_index_from_storage
11
+ from llama_index.llms.anthropic import Anthropic
12
+ from llama_index.embeddings.huggingface import HuggingFaceEmbedding
13
+ from llama_index.core.node_parser import SentenceSplitter
14
+ from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
15
+
16
+ import config
17
+
18
+ # Configure logging
19
+ logger = logging.getLogger(__name__)
20
+
21
+ class Chatbot:
22
+ """Chatbot for document question answering using LlamaIndex."""
23
+
24
+ def __init__(self, config_dict: Optional[Dict[str, Any]] = None):
25
+ """Initialize the chatbot with configuration.
26
+
27
+ Args:
28
+ config_dict: Optional configuration dictionary. If not provided,
29
+ configuration is loaded from environment variables.
30
+ """
31
+ # Set up basic variables and load configuration
32
+ self.config = config_dict or config.get_chatbot_config()
33
+ self.api_key = self._get_api_key()
34
+ self.index = None
35
+ self.query_engine = None
36
+ self.llm = None
37
+ self.embed_model = None
38
+
39
+ # Set up debugging tools to help track any issues
40
+ self.debug_handler = LlamaDebugHandler(print_trace_on_end=True)
41
+ self.callback_manager = CallbackManager([self.debug_handler])
42
+
43
+ # Set up all the components needed for the chatbot
44
+ self._initialize_components()
45
+
46
+ def _get_api_key(self) -> str:
47
+ """Get API key from environment or config.
48
+
49
+ Returns:
50
+ API key as string
51
+
52
+ Raises:
53
+ ValueError: If API key is not found
54
+ """
55
+ api_key = config.ANTHROPIC_API_KEY or self.config.get("api_key")
56
+ if not api_key:
57
+ raise ValueError("API key not found in environment or config")
58
+ return api_key
59
+
60
+ def _initialize_components(self):
61
+ """Initialize all components with proper error handling.
62
+
63
+ Sets up the LLM, embedding model, and other settings.
64
+
65
+ Raises:
66
+ Exception: If component initialization fails
67
+ """
68
+ try:
69
+ # Set up the language model (Claude) with our settings
70
+ logger.info("Setting up Claude language model...")
71
+ self.llm = Anthropic(
72
+ api_key=self.api_key,
73
+ model=self.config.get("model", config.LLM_MODEL),
74
+ temperature=self.config.get("temperature", config.LLM_TEMPERATURE),
75
+ max_tokens=self.config.get("max_tokens", config.LLM_MAX_TOKENS)
76
+ )
77
+
78
+ # Set up the model that converts text into numbers (embeddings)
79
+ logger.info("Setting up text embedding model...")
80
+ self.embed_model = HuggingFaceEmbedding(
81
+ model_name=self.config.get("embedding_model", config.EMBEDDING_MODEL),
82
+ device=self.config.get("device", config.EMBEDDING_DEVICE),
83
+ embed_batch_size=self.config.get("embed_batch_size", config.EMBEDDING_BATCH_SIZE)
84
+ )
85
+
86
+ # Configure all the settings for the chatbot
87
+ logger.info("Configuring chatbot settings...")
88
+ Settings.embed_model = self.embed_model
89
+ Settings.text_splitter = SentenceSplitter(
90
+ chunk_size=self.config.get("chunk_size", config.CHUNK_SIZE),
91
+ chunk_overlap=self.config.get("chunk_overlap", config.CHUNK_OVERLAP),
92
+ paragraph_separator="\n\n"
93
+ )
94
+ Settings.llm = self.llm
95
+ Settings.callback_manager = self.callback_manager
96
+
97
+ logger.info("Components initialized successfully")
98
+
99
+ except Exception as e:
100
+ logger.error(f"Error initializing components: {e}")
101
+ raise
102
+
103
+ def load_documents(self, data_dir: str = None) -> List:
104
+ """Load documents with retry logic.
105
+
106
+ Args:
107
+ data_dir: Directory containing documents to load. If None, uses default.
108
+
109
+ Returns:
110
+ List of loaded documents
111
+
112
+ Raises:
113
+ Exception: If document loading fails after retries
114
+ """
115
+ # Try to load documents up to 3 times if there's an error
116
+ max_retries = 3
117
+ retry_delay = 1
118
+ data_dir = data_dir or config.DATA_DIR
119
+
120
+ for attempt in range(max_retries):
121
+ try:
122
+ logger.info(f"Loading documents from {data_dir}...")
123
+ documents = SimpleDirectoryReader(data_dir).load_data()
124
+ logger.info(f"Loaded {len(documents)} documents")
125
+ return documents
126
+ except Exception as e:
127
+ if attempt < max_retries - 1:
128
+ logger.warning(f"Attempt {attempt + 1} failed: {e}. Retrying in {retry_delay} seconds...")
129
+ time.sleep(retry_delay)
130
+ else:
131
+ logger.error(f"Failed to load documents after {max_retries} attempts: {e}")
132
+ raise
133
+
134
+ def create_index(self, documents, index_dir: str = None):
135
+ """Create index with error handling.
136
+
137
+ Args:
138
+ documents: Documents to index
139
+ index_dir: Directory to store the index. If None, uses default.
140
+
141
+ Raises:
142
+ Exception: If index creation fails
143
+ """
144
+ index_dir = index_dir or config.INDEX_DIR
145
+ try:
146
+ # Check if index already exists
147
+ if os.path.exists(os.path.join(index_dir, "index_store.json")):
148
+ logger.info("Loading existing index...")
149
+ storage_context = StorageContext.from_defaults(persist_dir=index_dir)
150
+ self.index = load_index_from_storage(storage_context)
151
+ logger.info("Index loaded successfully")
152
+ return
153
+
154
+ # Create a new index if none exists
155
+ logger.info("Creating new index...")
156
+ with tqdm(total=1, desc="Creating searchable index") as pbar:
157
+ self.index = VectorStoreIndex.from_documents(documents)
158
+ # Save the index
159
+ self.index.storage_context.persist(persist_dir=index_dir)
160
+ pbar.update(1)
161
+ logger.info("Index created and saved successfully")
162
+ except Exception as e:
163
+ logger.error(f"Error creating/loading index: {e}")
164
+ raise
165
+
166
+ def initialize_query_engine(self):
167
+ """Initialize query engine with error handling.
168
+
169
+ Sets up the query engine from the index.
170
+
171
+ Raises:
172
+ Exception: If query engine initialization fails
173
+ """
174
+ try:
175
+ # Set up the system that will handle questions
176
+ logger.info("Initializing query engine...")
177
+ self.query_engine = self.index.as_query_engine()
178
+ logger.info("Query engine initialized successfully")
179
+ except Exception as e:
180
+ logger.error(f"Error initializing query engine: {e}")
181
+ raise
182
+
183
+ def query(self, query_text: str) -> str:
184
+ """Execute a query with error handling and retries.
185
+
186
+ Args:
187
+ query_text: The question to answer
188
+
189
+ Returns:
190
+ Response as string
191
+
192
+ Raises:
193
+ Exception: If query fails after retries
194
+ """
195
+ # Try to answer questions up to 3 times if there's an error
196
+ max_retries = 3
197
+ retry_delay = 1
198
+
199
+ for attempt in range(max_retries):
200
+ try:
201
+ logger.info(f"Executing query: {query_text}")
202
+ response = self.query_engine.query(query_text)
203
+ logger.info("Query executed successfully")
204
+ return str(response)
205
+ except Exception as e:
206
+ if attempt < max_retries - 1:
207
+ logger.warning(f"Attempt {attempt + 1} failed: {e}. Retrying in {retry_delay} seconds...")
208
+ time.sleep(retry_delay)
209
+ else:
210
+ logger.error(f"Failed to execute query after {max_retries} attempts: {e}")
211
+ raise
212
+
213
+ def cleanup(self):
214
+ """Clean up resources.
215
+
216
+ Performs any necessary cleanup operations.
217
+ """
218
+ try:
219
+ # Clean up any resources we used
220
+ logger.info("Cleaning up resources...")
221
+ logger.info("Cleanup completed successfully")
222
+ except Exception as e:
223
+ logger.error(f"Error during cleanup: {e}")
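Taken together, the class is driven in four steps: construct, load, index, query. A minimal usage sketch based only on the methods shown above, assuming ANTHROPIC_API_KEY is set and documents exist in the configured data directory:

    from chatbot.core import Chatbot

    bot = Chatbot()                  # falls back to config.get_chatbot_config()
    docs = bot.load_documents()      # reads config.DATA_DIR by default
    bot.create_index(docs)           # reuses a persisted index if one exists
    bot.initialize_query_engine()
    print(bot.query("What do these documents cover?"))
    bot.cleanup()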
config.py ADDED
@@ -0,0 +1,49 @@
1
+ """
2
+ Configuration settings for the chatbot application.
3
+ Loads from environment variables with sensible defaults.
4
+ """
5
+ import os
6
+ from dotenv import load_dotenv
7
+
8
+ # Load environment variables from .env file if it exists
9
+ load_dotenv()
10
+
11
+ # Paths
12
+ DATA_DIR = os.getenv("DATA_DIR", "data")
13
+ INDEX_DIR = os.getenv("INDEX_DIR", "index")
14
+
15
+ # LLM Configuration
16
+ ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
17
+ LLM_MODEL = os.getenv("LLM_MODEL", "claude-3-7-sonnet-20250219")
18
+ LLM_TEMPERATURE = float(os.getenv("LLM_TEMPERATURE", "0.1"))
19
+ LLM_MAX_TOKENS = int(os.getenv("LLM_MAX_TOKENS", "2048"))
20
+
21
+ # Embedding Configuration
22
+ EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
23
+ EMBEDDING_DEVICE = os.getenv("EMBEDDING_DEVICE", "cpu")
24
+ EMBEDDING_BATCH_SIZE = int(os.getenv("EMBEDDING_BATCH_SIZE", "8"))
25
+
26
+ # Document Processing Configuration
27
+ CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1024"))
28
+ CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "100"))
29
+
30
+ # Database Configuration (for future use)
31
+ DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///chatbot.db")
32
+
33
+ # Debug and logging
34
+ DEBUG = os.getenv("DEBUG", "False").lower() == "true"
35
+ LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
36
+
37
+ # Function to get configuration as a dictionary
38
+ def get_chatbot_config():
39
+ """Return chatbot configuration as a dictionary"""
40
+ return {
41
+ "model": LLM_MODEL,
42
+ "temperature": LLM_TEMPERATURE,
43
+ "max_tokens": LLM_MAX_TOKENS,
44
+ "embedding_model": EMBEDDING_MODEL,
45
+ "device": EMBEDDING_DEVICE,
46
+ "embed_batch_size": EMBEDDING_BATCH_SIZE,
47
+ "chunk_size": CHUNK_SIZE,
48
+ "chunk_overlap": CHUNK_OVERLAP,
49
+ }
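Every setting above can be overridden through the environment (or a .env file read by python-dotenv). A sample .env; the variable names come from the module above, the values are placeholders:

    ANTHROPIC_API_KEY=sk-ant-...
    LLM_MODEL=claude-3-7-sonnet-20250219
    LLM_TEMPERATURE=0.1
    LLM_MAX_TOKENS=2048
    DATA_DIR=data
    INDEX_DIR=index
    CHUNK_SIZE=1024
    CHUNK_OVERLAP=100
    LOG_LEVEL=INFO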
database/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ """
2
+ Database models and operations for the chatbot application.
3
+ """
feedback/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ """
2
+ Feedback collection and analysis module for the chatbot.
3
+ """
frontend.py ADDED
@@ -0,0 +1,132 @@
1
+ """
2
+ Frontend Streamlit application for the chatbot.
3
+ """
4
+ import streamlit as st
5
+ # Import directly from backend.py instead of the chatbot module
6
+ from backend import Chatbot
7
+ from utils.logging_config import setup_logging
8
+ import config
9
+
10
+ # Setup logging
11
+ logger = setup_logging()
12
+
13
+ # Set page config
14
+ st.set_page_config(
15
+ page_title="Document Chatbot",
16
+ page_icon="📚",
17
+ layout="centered",
18
+ initial_sidebar_state="collapsed"
19
+ )
20
+
21
+ # Hide Streamlit's default elements and style sidebar
22
+ st.markdown("""
23
+ <style>
24
+ /* Hide default elements */
25
+ button[kind="deploy"],
26
+ [data-testid="stToolbar"],
27
+ .stDeployButton,
28
+ #MainMenu,
29
+ footer {
30
+ display: none !important;
31
+ }
32
+
33
+ /* Style the sidebar */
34
+ [data-testid="stSidebar"] {
35
+ width: 120px !important;
36
+ background-color: #0e1117 !important;
37
+ border-right: 1px solid #1e1e1e !important;
38
+ }
39
+
40
+ /* Style the collapse button to prevent movement and maintain size */
41
+ button[kind="menuButton"] {
42
+ left: 120px !important;
43
+ margin-left: 0 !important;
44
+ position: fixed !important;
45
+ transform: translateX(0) !important;
46
+ transition: none !important;
47
+ background-color: #0e1117 !important;
48
+ color: #fff !important;
49
+ z-index: 999 !important;
50
+ }
51
+
52
+ /* Button styling */
53
+ [data-testid="stSidebar"] [data-testid="stButton"] button {
54
+ background-color: #262730 !important;
55
+ color: white !important;
56
+ padding: 8px 10px !important;
57
+ width: 100px !important;
58
+ font-size: 0.9rem !important;
59
+ margin: 1rem auto !important;
60
+ display: block !important;
61
+ border-radius: 4px !important;
62
+ white-space: nowrap !important;
63
+ }
64
+
65
+ [data-testid="stSidebar"] [data-testid="stButton"] button p {
66
+ text-align: center !important;
67
+ white-space: nowrap !important;
68
+ overflow: visible !important;
69
+ }
70
+
71
+ [data-testid="stSidebar"] [data-testid="stButton"] button:hover {
72
+ background-color: #1E2130 !important;
73
+ }
74
+ </style>
75
+ """, unsafe_allow_html=True)
76
+
77
+ # Create a sidebar with just the button
78
+ with st.sidebar:
79
+ def clear_chat():
80
+ st.session_state.messages = []
81
+ logger.info("Chat history cleared via sidebar button")
82
+ st.rerun()
83
+
84
+ st.button("Clear Chat", on_click=clear_chat)
85
+
86
+ # Initialize session state for chat history
87
+ if "messages" not in st.session_state:
88
+ st.session_state.messages = []
89
+
90
+ # Initialize chatbot
91
+ if "chatbot" not in st.session_state:
92
+ with st.spinner("Initializing chatbot..."):
93
+ # Get configuration from config module
94
+ chatbot_config = config.get_chatbot_config()
95
+
96
+ # Initialize chatbot
97
+ logger.info("Initializing chatbot...")
98
+ st.session_state.chatbot = Chatbot(chatbot_config)
99
+
100
+ # Load documents and create index
101
+ documents = st.session_state.chatbot.load_documents()
102
+ st.session_state.chatbot.create_index(documents)
103
+ st.session_state.chatbot.initialize_query_engine()
104
+ logger.info("Chatbot initialized successfully")
105
+
106
+ # Title and description
107
+ st.title("📚 Document Chatbot")
108
+ st.markdown("""
109
+ This chatbot can answer questions about your documents.
110
+ Ask any question about the content in your documents!
111
+ """)
112
+
113
+ # Display chat messages
114
+ for message in st.session_state.messages:
115
+ with st.chat_message(message["role"]):
116
+ st.markdown(message["content"])
117
+
118
+ # Chat input
119
+ if prompt := st.chat_input("What would you like to know?"):
120
+ # Add user message to chat history
121
+ st.session_state.messages.append({"role": "user", "content": prompt})
122
+ with st.chat_message("user"):
123
+ st.markdown(prompt)
124
+
125
+ # Get chatbot response
126
+ with st.chat_message("assistant"):
127
+ with st.spinner("Thinking..."):
128
+ logger.info(f"User query: {prompt}")
129
+ response = st.session_state.chatbot.query(prompt)
130
+ st.markdown(response)
131
+ st.session_state.messages.append({"role": "assistant", "content": response})
132
+ logger.info("Response provided to user")
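The front end is a standard Streamlit script, so it is launched with the stock CLI; assuming the dependencies from requirements.txt are installed and ANTHROPIC_API_KEY is exported:

    streamlit run frontend.py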
myenv/bin/Activate.ps1 ADDED
@@ -0,0 +1,247 @@
1
+ <#
2
+ .Synopsis
3
+ Activate a Python virtual environment for the current PowerShell session.
4
+
5
+ .Description
6
+ Pushes the python executable for a virtual environment to the front of the
7
+ $Env:PATH environment variable and sets the prompt to signify that you are
8
+ in a Python virtual environment. Makes use of the command line switches as
9
+ well as the `pyvenv.cfg` file values present in the virtual environment.
10
+
11
+ .Parameter VenvDir
12
+ Path to the directory that contains the virtual environment to activate. The
13
+ default value for this is the parent of the directory that the Activate.ps1
14
+ script is located within.
15
+
16
+ .Parameter Prompt
17
+ The prompt prefix to display when this virtual environment is activated. By
18
+ default, this prompt is the name of the virtual environment folder (VenvDir)
19
+ surrounded by parentheses and followed by a single space (ie. '(.venv) ').
20
+
21
+ .Example
22
+ Activate.ps1
23
+ Activates the Python virtual environment that contains the Activate.ps1 script.
24
+
25
+ .Example
26
+ Activate.ps1 -Verbose
27
+ Activates the Python virtual environment that contains the Activate.ps1 script,
28
+ and shows extra information about the activation as it executes.
29
+
30
+ .Example
31
+ Activate.ps1 -VenvDir C:\Users\MyUser\Common\.venv
32
+ Activates the Python virtual environment located in the specified location.
33
+
34
+ .Example
35
+ Activate.ps1 -Prompt "MyPython"
36
+ Activates the Python virtual environment that contains the Activate.ps1 script,
37
+ and prefixes the current prompt with the specified string (surrounded in
38
+ parentheses) while the virtual environment is active.
39
+
40
+ .Notes
41
+ On Windows, it may be required to enable this Activate.ps1 script by setting the
42
+ execution policy for the user. You can do this by issuing the following PowerShell
43
+ command:
44
+
45
+ PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
46
+
47
+ For more information on Execution Policies:
48
+ https://go.microsoft.com/fwlink/?LinkID=135170
49
+
50
+ #>
51
+ Param(
52
+ [Parameter(Mandatory = $false)]
53
+ [String]
54
+ $VenvDir,
55
+ [Parameter(Mandatory = $false)]
56
+ [String]
57
+ $Prompt
58
+ )
59
+
60
+ <# Function declarations --------------------------------------------------- #>
61
+
62
+ <#
63
+ .Synopsis
64
+ Remove all shell session elements added by the Activate script, including the
65
+ addition of the virtual environment's Python executable from the beginning of
66
+ the PATH variable.
67
+
68
+ .Parameter NonDestructive
69
+ If present, do not remove this function from the global namespace for the
70
+ session.
71
+
72
+ #>
73
+ function global:deactivate ([switch]$NonDestructive) {
74
+ # Revert to original values
75
+
76
+ # The prior prompt:
77
+ if (Test-Path -Path Function:_OLD_VIRTUAL_PROMPT) {
78
+ Copy-Item -Path Function:_OLD_VIRTUAL_PROMPT -Destination Function:prompt
79
+ Remove-Item -Path Function:_OLD_VIRTUAL_PROMPT
80
+ }
81
+
82
+ # The prior PYTHONHOME:
83
+ if (Test-Path -Path Env:_OLD_VIRTUAL_PYTHONHOME) {
84
+ Copy-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME -Destination Env:PYTHONHOME
85
+ Remove-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME
86
+ }
87
+
88
+ # The prior PATH:
89
+ if (Test-Path -Path Env:_OLD_VIRTUAL_PATH) {
90
+ Copy-Item -Path Env:_OLD_VIRTUAL_PATH -Destination Env:PATH
91
+ Remove-Item -Path Env:_OLD_VIRTUAL_PATH
92
+ }
93
+
94
+ # Just remove the VIRTUAL_ENV altogether:
95
+ if (Test-Path -Path Env:VIRTUAL_ENV) {
96
+ Remove-Item -Path env:VIRTUAL_ENV
97
+ }
98
+
99
+ # Just remove VIRTUAL_ENV_PROMPT altogether.
100
+ if (Test-Path -Path Env:VIRTUAL_ENV_PROMPT) {
101
+ Remove-Item -Path env:VIRTUAL_ENV_PROMPT
102
+ }
103
+
104
+ # Just remove the _PYTHON_VENV_PROMPT_PREFIX altogether:
105
+ if (Get-Variable -Name "_PYTHON_VENV_PROMPT_PREFIX" -ErrorAction SilentlyContinue) {
106
+ Remove-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Scope Global -Force
107
+ }
108
+
109
+ # Leave deactivate function in the global namespace if requested:
110
+ if (-not $NonDestructive) {
111
+ Remove-Item -Path function:deactivate
112
+ }
113
+ }
114
+
115
+ <#
116
+ .Description
117
+ Get-PyVenvConfig parses the values from the pyvenv.cfg file located in the
118
+ given folder, and returns them in a map.
119
+
120
+ For each line in the pyvenv.cfg file, if that line can be parsed into exactly
121
+ two strings separated by `=` (with any amount of whitespace surrounding the =)
122
+ then it is considered a `key = value` line. The left hand string is the key,
123
+ the right hand is the value.
124
+
125
+ If the value starts with a `'` or a `"` then the first and last character is
126
+ stripped from the value before being captured.
127
+
128
+ .Parameter ConfigDir
129
+ Path to the directory that contains the `pyvenv.cfg` file.
130
+ #>
131
+ function Get-PyVenvConfig(
132
+ [String]
133
+ $ConfigDir
134
+ ) {
135
+ Write-Verbose "Given ConfigDir=$ConfigDir, obtain values in pyvenv.cfg"
136
+
137
+ # Ensure the file exists, and issue a warning if it doesn't (but still allow the function to continue).
138
+ $pyvenvConfigPath = Join-Path -Resolve -Path $ConfigDir -ChildPath 'pyvenv.cfg' -ErrorAction Continue
139
+
140
+ # An empty map will be returned if no config file is found.
141
+ $pyvenvConfig = @{ }
142
+
143
+ if ($pyvenvConfigPath) {
144
+
145
+ Write-Verbose "File exists, parse `key = value` lines"
146
+ $pyvenvConfigContent = Get-Content -Path $pyvenvConfigPath
147
+
148
+ $pyvenvConfigContent | ForEach-Object {
149
+ $keyval = $PSItem -split "\s*=\s*", 2
150
+ if ($keyval[0] -and $keyval[1]) {
151
+ $val = $keyval[1]
152
+
153
+ # Remove extraneous quotations around a string value.
154
+ if ("'""".Contains($val.Substring(0, 1))) {
155
+ $val = $val.Substring(1, $val.Length - 2)
156
+ }
157
+
158
+ $pyvenvConfig[$keyval[0]] = $val
159
+ Write-Verbose "Adding Key: '$($keyval[0])'='$val'"
160
+ }
161
+ }
162
+ }
163
+ return $pyvenvConfig
164
+ }
165
+
166
+
167
+ <# Begin Activate script --------------------------------------------------- #>
168
+
169
+ # Determine the containing directory of this script
170
+ $VenvExecPath = Split-Path -Parent $MyInvocation.MyCommand.Definition
171
+ $VenvExecDir = Get-Item -Path $VenvExecPath
172
+
173
+ Write-Verbose "Activation script is located in path: '$VenvExecPath'"
174
+ Write-Verbose "VenvExecDir Fullname: '$($VenvExecDir.FullName)"
175
+ Write-Verbose "VenvExecDir Name: '$($VenvExecDir.Name)"
176
+
177
+ # Set values required in priority: CmdLine, ConfigFile, Default
178
+ # First, get the location of the virtual environment, it might not be
179
+ # VenvExecDir if specified on the command line.
180
+ if ($VenvDir) {
181
+ Write-Verbose "VenvDir given as parameter, using '$VenvDir' to determine values"
182
+ }
183
+ else {
184
+ Write-Verbose "VenvDir not given as a parameter, using parent directory name as VenvDir."
185
+ $VenvDir = $VenvExecDir.Parent.FullName.TrimEnd("\\/")
186
+ Write-Verbose "VenvDir=$VenvDir"
187
+ }
188
+
189
+ # Next, read the `pyvenv.cfg` file to determine any required value such
190
+ # as `prompt`.
191
+ $pyvenvCfg = Get-PyVenvConfig -ConfigDir $VenvDir
192
+
193
+ # Next, set the prompt from the command line, or the config file, or
194
+ # just use the name of the virtual environment folder.
195
+ if ($Prompt) {
196
+ Write-Verbose "Prompt specified as argument, using '$Prompt'"
197
+ }
198
+ else {
199
+ Write-Verbose "Prompt not specified as argument to script, checking pyvenv.cfg value"
200
+ if ($pyvenvCfg -and $pyvenvCfg['prompt']) {
201
+ Write-Verbose " Setting based on value in pyvenv.cfg='$($pyvenvCfg['prompt'])'"
202
+ $Prompt = $pyvenvCfg['prompt'];
203
+ }
204
+ else {
205
+ Write-Verbose " Setting prompt based on parent's directory's name. (Is the directory name passed to venv module when creating the virtual environment)"
206
+ Write-Verbose " Got leaf-name of $VenvDir='$(Split-Path -Path $venvDir -Leaf)'"
207
+ $Prompt = Split-Path -Path $venvDir -Leaf
208
+ }
209
+ }
210
+
211
+ Write-Verbose "Prompt = '$Prompt'"
212
+ Write-Verbose "VenvDir='$VenvDir'"
213
+
214
+ # Deactivate any currently active virtual environment, but leave the
215
+ # deactivate function in place.
216
+ deactivate -nondestructive
217
+
218
+ # Now set the environment variable VIRTUAL_ENV, used by many tools to determine
219
+ # that there is an activated venv.
220
+ $env:VIRTUAL_ENV = $VenvDir
221
+
222
+ if (-not $Env:VIRTUAL_ENV_DISABLE_PROMPT) {
223
+
224
+ Write-Verbose "Setting prompt to '$Prompt'"
225
+
226
+ # Set the prompt to include the env name
227
+ # Make sure _OLD_VIRTUAL_PROMPT is global
228
+ function global:_OLD_VIRTUAL_PROMPT { "" }
229
+ Copy-Item -Path function:prompt -Destination function:_OLD_VIRTUAL_PROMPT
230
+ New-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Description "Python virtual environment prompt prefix" -Scope Global -Option ReadOnly -Visibility Public -Value $Prompt
231
+
232
+ function global:prompt {
233
+ Write-Host -NoNewline -ForegroundColor Green "($_PYTHON_VENV_PROMPT_PREFIX) "
234
+ _OLD_VIRTUAL_PROMPT
235
+ }
236
+ $env:VIRTUAL_ENV_PROMPT = $Prompt
237
+ }
238
+
239
+ # Clear PYTHONHOME
240
+ if (Test-Path -Path Env:PYTHONHOME) {
241
+ Copy-Item -Path Env:PYTHONHOME -Destination Env:_OLD_VIRTUAL_PYTHONHOME
242
+ Remove-Item -Path Env:PYTHONHOME
243
+ }
244
+
245
+ # Add the venv to the PATH
246
+ Copy-Item -Path Env:PATH -Destination Env:_OLD_VIRTUAL_PATH
247
+ $Env:PATH = "$VenvExecDir$([System.IO.Path]::PathSeparator)$Env:PATH"
myenv/bin/activate ADDED
@@ -0,0 +1,63 @@
1
+ # This file must be used with "source bin/activate" *from bash*
2
+ # you cannot run it directly
3
+
4
+ deactivate () {
5
+ # reset old environment variables
6
+ if [ -n "${_OLD_VIRTUAL_PATH:-}" ] ; then
7
+ PATH="${_OLD_VIRTUAL_PATH:-}"
8
+ export PATH
9
+ unset _OLD_VIRTUAL_PATH
10
+ fi
11
+ if [ -n "${_OLD_VIRTUAL_PYTHONHOME:-}" ] ; then
12
+ PYTHONHOME="${_OLD_VIRTUAL_PYTHONHOME:-}"
13
+ export PYTHONHOME
14
+ unset _OLD_VIRTUAL_PYTHONHOME
15
+ fi
16
+
17
+ # Call hash to forget past commands. Without forgetting
18
+ # past commands the $PATH changes we made may not be respected
19
+ hash -r 2> /dev/null
20
+
21
+ if [ -n "${_OLD_VIRTUAL_PS1:-}" ] ; then
22
+ PS1="${_OLD_VIRTUAL_PS1:-}"
23
+ export PS1
24
+ unset _OLD_VIRTUAL_PS1
25
+ fi
26
+
27
+ unset VIRTUAL_ENV
28
+ unset VIRTUAL_ENV_PROMPT
29
+ if [ ! "${1:-}" = "nondestructive" ] ; then
30
+ # Self destruct!
31
+ unset -f deactivate
32
+ fi
33
+ }
34
+
35
+ # unset irrelevant variables
36
+ deactivate nondestructive
37
+
38
+ VIRTUAL_ENV="/Users/paulmagee/llamaindex-demo-1/myenv"
39
+ export VIRTUAL_ENV
40
+
41
+ _OLD_VIRTUAL_PATH="$PATH"
42
+ PATH="$VIRTUAL_ENV/bin:$PATH"
43
+ export PATH
44
+
45
+ # unset PYTHONHOME if set
46
+ # this will fail if PYTHONHOME is set to the empty string (which is bad anyway)
47
+ # could use `if (set -u; : $PYTHONHOME) ;` in bash
48
+ if [ -n "${PYTHONHOME:-}" ] ; then
49
+ _OLD_VIRTUAL_PYTHONHOME="${PYTHONHOME:-}"
50
+ unset PYTHONHOME
51
+ fi
52
+
53
+ if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT:-}" ] ; then
54
+ _OLD_VIRTUAL_PS1="${PS1:-}"
55
+ PS1="(myenv) ${PS1:-}"
56
+ export PS1
57
+ VIRTUAL_ENV_PROMPT="(myenv) "
58
+ export VIRTUAL_ENV_PROMPT
59
+ fi
60
+
61
+ # Call hash to forget past commands. Without forgetting
62
+ # past commands the $PATH changes we made may not be respected
63
+ hash -r 2> /dev/null
myenv/bin/activate.csh ADDED
@@ -0,0 +1,26 @@
1
+ # This file must be used with "source bin/activate.csh" *from csh*.
2
+ # You cannot run it directly.
3
+ # Created by Davide Di Blasi <davidedb@gmail.com>.
4
+ # Ported to Python 3.3 venv by Andrew Svetlov <andrew.svetlov@gmail.com>
5
+
6
+ alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PATH" && unset _OLD_VIRTUAL_PATH; rehash; test $?_OLD_VIRTUAL_PROMPT != 0 && set prompt="$_OLD_VIRTUAL_PROMPT" && unset _OLD_VIRTUAL_PROMPT; unsetenv VIRTUAL_ENV; unsetenv VIRTUAL_ENV_PROMPT; test "\!:*" != "nondestructive" && unalias deactivate'
7
+
8
+ # Unset irrelevant variables.
9
+ deactivate nondestructive
10
+
11
+ setenv VIRTUAL_ENV "/Users/paulmagee/llamaindex-demo-1/myenv"
12
+
13
+ set _OLD_VIRTUAL_PATH="$PATH"
14
+ setenv PATH "$VIRTUAL_ENV/bin:$PATH"
15
+
16
+
17
+ set _OLD_VIRTUAL_PROMPT="$prompt"
18
+
19
+ if (! "$?VIRTUAL_ENV_DISABLE_PROMPT") then
20
+ set prompt = "(myenv) $prompt"
21
+ setenv VIRTUAL_ENV_PROMPT "(myenv) "
22
+ endif
23
+
24
+ alias pydoc python -m pydoc
25
+
26
+ rehash
myenv/bin/activate.fish ADDED
@@ -0,0 +1,69 @@
1
+ # This file must be used with "source <venv>/bin/activate.fish" *from fish*
2
+ # (https://fishshell.com/); you cannot run it directly.
3
+
4
+ function deactivate -d "Exit virtual environment and return to normal shell environment"
5
+ # reset old environment variables
6
+ if test -n "$_OLD_VIRTUAL_PATH"
7
+ set -gx PATH $_OLD_VIRTUAL_PATH
8
+ set -e _OLD_VIRTUAL_PATH
9
+ end
10
+ if test -n "$_OLD_VIRTUAL_PYTHONHOME"
11
+ set -gx PYTHONHOME $_OLD_VIRTUAL_PYTHONHOME
12
+ set -e _OLD_VIRTUAL_PYTHONHOME
13
+ end
14
+
15
+ if test -n "$_OLD_FISH_PROMPT_OVERRIDE"
16
+ set -e _OLD_FISH_PROMPT_OVERRIDE
17
+ # prevents error when using nested fish instances (Issue #93858)
18
+ if functions -q _old_fish_prompt
19
+ functions -e fish_prompt
20
+ functions -c _old_fish_prompt fish_prompt
21
+ functions -e _old_fish_prompt
22
+ end
23
+ end
24
+
25
+ set -e VIRTUAL_ENV
26
+ set -e VIRTUAL_ENV_PROMPT
27
+ if test "$argv[1]" != "nondestructive"
28
+ # Self-destruct!
29
+ functions -e deactivate
30
+ end
31
+ end
32
+
33
+ # Unset irrelevant variables.
34
+ deactivate nondestructive
35
+
36
+ set -gx VIRTUAL_ENV "/Users/paulmagee/llamaindex-demo-1/myenv"
37
+
38
+ set -gx _OLD_VIRTUAL_PATH $PATH
39
+ set -gx PATH "$VIRTUAL_ENV/bin" $PATH
40
+
41
+ # Unset PYTHONHOME if set.
42
+ if set -q PYTHONHOME
43
+ set -gx _OLD_VIRTUAL_PYTHONHOME $PYTHONHOME
44
+ set -e PYTHONHOME
45
+ end
46
+
47
+ if test -z "$VIRTUAL_ENV_DISABLE_PROMPT"
48
+ # fish uses a function instead of an env var to generate the prompt.
49
+
50
+ # Save the current fish_prompt function as the function _old_fish_prompt.
51
+ functions -c fish_prompt _old_fish_prompt
52
+
53
+ # With the original prompt function renamed, we can override with our own.
54
+ function fish_prompt
55
+ # Save the return status of the last command.
56
+ set -l old_status $status
57
+
58
+ # Output the venv prompt; color taken from the blue of the Python logo.
59
+ printf "%s%s%s" (set_color 4B8BBE) "(myenv) " (set_color normal)
60
+
61
+ # Restore the return status of the previous command.
62
+ echo "exit $old_status" | .
63
+ # Output the original/"old" prompt.
64
+ _old_fish_prompt
65
+ end
66
+
67
+ set -gx _OLD_FISH_PROMPT_OVERRIDE "$VIRTUAL_ENV"
68
+ set -gx VIRTUAL_ENV_PROMPT "(myenv) "
69
+ end
myenv/bin/convert-caffe2-to-onnx ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from caffe2.python.onnx.bin.conversion import caffe2_to_onnx
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(caffe2_to_onnx())
myenv/bin/convert-onnx-to-caffe2 ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from caffe2.python.onnx.bin.conversion import onnx_to_caffe2
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(onnx_to_caffe2())
myenv/bin/distro ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from distro.distro import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/dotenv ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from dotenv.__main__ import cli
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(cli())
myenv/bin/f2py ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from numpy.f2py.f2py2e import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/filetype ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from filetype.__main__ import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/griffe ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from griffe import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/httpx ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from httpx import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/huggingface-cli ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from huggingface_hub.commands.huggingface_cli import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/isympy ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from isympy import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/jp.py ADDED
@@ -0,0 +1,54 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+
3
+ import sys
4
+ import json
5
+ import argparse
6
+ from pprint import pformat
7
+
8
+ import jmespath
9
+ from jmespath import exceptions
10
+
11
+
12
+ def main():
13
+ parser = argparse.ArgumentParser()
14
+ parser.add_argument('expression')
15
+ parser.add_argument('-f', '--filename',
16
+ help=('The filename containing the input data. '
17
+ 'If a filename is not given then data is '
18
+ 'read from stdin.'))
19
+ parser.add_argument('--ast', action='store_true',
20
+ help=('Pretty print the AST, do not search the data.'))
21
+ args = parser.parse_args()
22
+ expression = args.expression
23
+ if args.ast:
24
+ # Only print the AST
25
+ expression = jmespath.compile(args.expression)
26
+ sys.stdout.write(pformat(expression.parsed))
27
+ sys.stdout.write('\n')
28
+ return 0
29
+ if args.filename:
30
+ with open(args.filename, 'r') as f:
31
+ data = json.load(f)
32
+ else:
33
+ data = sys.stdin.read()
34
+ data = json.loads(data)
35
+ try:
36
+ sys.stdout.write(json.dumps(
37
+ jmespath.search(expression, data), indent=4, ensure_ascii=False))
38
+ sys.stdout.write('\n')
39
+ except exceptions.ArityError as e:
40
+ sys.stderr.write("invalid-arity: %s\n" % e)
41
+ return 1
42
+ except exceptions.JMESPathTypeError as e:
43
+ sys.stderr.write("invalid-type: %s\n" % e)
44
+ return 1
45
+ except exceptions.UnknownFunctionError as e:
46
+ sys.stderr.write("unknown-function: %s\n" % e)
47
+ return 1
48
+ except exceptions.ParseError as e:
49
+ sys.stderr.write("syntax-error: %s\n" % e)
50
+ return 1
51
+
52
+
53
+ if __name__ == '__main__':
54
+ sys.exit(main())
myenv/bin/nltk ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from nltk.cli import cli
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(cli())
myenv/bin/normalizer ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from charset_normalizer import cli
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(cli.cli_detect())
myenv/bin/pip ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from pip._internal.cli.main import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/pip3 ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from pip._internal.cli.main import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/pip3.11 ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from pip._internal.cli.main import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/pyrsa-decrypt ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from rsa.cli import decrypt
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(decrypt())
myenv/bin/pyrsa-encrypt ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from rsa.cli import encrypt
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(encrypt())
myenv/bin/pyrsa-keygen ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from rsa.cli import keygen
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(keygen())
myenv/bin/pyrsa-priv2pub ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from rsa.util import private_to_public
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(private_to_public())
myenv/bin/pyrsa-sign ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from rsa.cli import sign
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(sign())
myenv/bin/pyrsa-verify ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from rsa.cli import verify
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(verify())
myenv/bin/python ADDED
@@ -0,0 +1 @@
1
+ /Users/paulmagee/.pyenv/versions/3.11.9/bin/python
myenv/bin/python3 ADDED
@@ -0,0 +1 @@
1
+ python
myenv/bin/python3.11 ADDED
@@ -0,0 +1 @@
1
+ python
myenv/bin/torchrun ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from torch.distributed.run import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/tqdm ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from tqdm.cli import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/bin/transformers-cli ADDED
@@ -0,0 +1,8 @@
1
+ #!/Users/paulmagee/llamaindex-demo-1/myenv/bin/python
2
+ # -*- coding: utf-8 -*-
3
+ import re
4
+ import sys
5
+ from transformers.commands.transformers_cli import main
6
+ if __name__ == '__main__':
7
+ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
8
+ sys.exit(main())
myenv/include/site/python3.11/greenlet/greenlet.h ADDED
@@ -0,0 +1,164 @@
1
+ /* -*- indent-tabs-mode: nil; tab-width: 4; -*- */
2
+
3
+ /* Greenlet object interface */
4
+
5
+ #ifndef Py_GREENLETOBJECT_H
6
+ #define Py_GREENLETOBJECT_H
7
+
8
+
9
+ #include <Python.h>
10
+
11
+ #ifdef __cplusplus
12
+ extern "C" {
13
+ #endif
14
+
15
+ /* This is deprecated and undocumented. It does not change. */
16
+ #define GREENLET_VERSION "1.0.0"
17
+
18
+ #ifndef GREENLET_MODULE
19
+ #define implementation_ptr_t void*
20
+ #endif
21
+
22
+ typedef struct _greenlet {
23
+ PyObject_HEAD
24
+ PyObject* weakreflist;
25
+ PyObject* dict;
26
+ implementation_ptr_t pimpl;
27
+ } PyGreenlet;
28
+
29
+ #define PyGreenlet_Check(op) (op && PyObject_TypeCheck(op, &PyGreenlet_Type))
30
+
31
+
32
+ /* C API functions */
33
+
34
+ /* Total number of symbols that are exported */
35
+ #define PyGreenlet_API_pointers 12
36
+
37
+ #define PyGreenlet_Type_NUM 0
38
+ #define PyExc_GreenletError_NUM 1
39
+ #define PyExc_GreenletExit_NUM 2
40
+
41
+ #define PyGreenlet_New_NUM 3
42
+ #define PyGreenlet_GetCurrent_NUM 4
43
+ #define PyGreenlet_Throw_NUM 5
44
+ #define PyGreenlet_Switch_NUM 6
45
+ #define PyGreenlet_SetParent_NUM 7
46
+
47
+ #define PyGreenlet_MAIN_NUM 8
48
+ #define PyGreenlet_STARTED_NUM 9
49
+ #define PyGreenlet_ACTIVE_NUM 10
50
+ #define PyGreenlet_GET_PARENT_NUM 11
51
+
52
+ #ifndef GREENLET_MODULE
53
+ /* This section is used by modules that uses the greenlet C API */
54
+ static void** _PyGreenlet_API = NULL;
55
+
56
+ # define PyGreenlet_Type \
57
+ (*(PyTypeObject*)_PyGreenlet_API[PyGreenlet_Type_NUM])
58
+
59
+ # define PyExc_GreenletError \
60
+ ((PyObject*)_PyGreenlet_API[PyExc_GreenletError_NUM])
61
+
62
+ # define PyExc_GreenletExit \
63
+ ((PyObject*)_PyGreenlet_API[PyExc_GreenletExit_NUM])
64
+
65
+ /*
66
+ * PyGreenlet_New(PyObject *args)
67
+ *
68
+ * greenlet.greenlet(run, parent=None)
69
+ */
70
+ # define PyGreenlet_New \
71
+ (*(PyGreenlet * (*)(PyObject * run, PyGreenlet * parent)) \
72
+ _PyGreenlet_API[PyGreenlet_New_NUM])
73
+
74
+ /*
75
+ * PyGreenlet_GetCurrent(void)
76
+ *
77
+ * greenlet.getcurrent()
78
+ */
79
+ # define PyGreenlet_GetCurrent \
80
+ (*(PyGreenlet * (*)(void)) _PyGreenlet_API[PyGreenlet_GetCurrent_NUM])
81
+
82
+ /*
83
+ * PyGreenlet_Throw(
84
+ * PyGreenlet *greenlet,
85
+ * PyObject *typ,
86
+ * PyObject *val,
87
+ * PyObject *tb)
88
+ *
89
+ * g.throw(...)
90
+ */
91
+ # define PyGreenlet_Throw \
92
+ (*(PyObject * (*)(PyGreenlet * self, \
93
+ PyObject * typ, \
94
+ PyObject * val, \
95
+ PyObject * tb)) \
96
+ _PyGreenlet_API[PyGreenlet_Throw_NUM])
97
+
98
+ /*
99
+ * PyGreenlet_Switch(PyGreenlet *greenlet, PyObject *args)
100
+ *
101
+ * g.switch(*args, **kwargs)
102
+ */
103
+ # define PyGreenlet_Switch \
104
+ (*(PyObject * \
105
+ (*)(PyGreenlet * greenlet, PyObject * args, PyObject * kwargs)) \
106
+ _PyGreenlet_API[PyGreenlet_Switch_NUM])
107
+
108
+ /*
109
+ * PyGreenlet_SetParent(PyObject *greenlet, PyObject *new_parent)
110
+ *
111
+ * g.parent = new_parent
112
+ */
113
+ # define PyGreenlet_SetParent \
114
+ (*(int (*)(PyGreenlet * greenlet, PyGreenlet * nparent)) \
115
+ _PyGreenlet_API[PyGreenlet_SetParent_NUM])
116
+
117
+ /*
118
+ * PyGreenlet_GetParent(PyObject* greenlet)
119
+ *
120
+ * return greenlet.parent;
121
+ *
122
+ * This could return NULL even if there is no exception active.
123
+ * If it does not return NULL, you are responsible for decrementing the
124
+ * reference count.
125
+ */
126
+ # define PyGreenlet_GetParent \
127
+ (*(PyGreenlet* (*)(PyGreenlet*)) \
128
+ _PyGreenlet_API[PyGreenlet_GET_PARENT_NUM])
129
+
130
+ /*
131
+ * deprecated, undocumented alias.
132
+ */
133
+ # define PyGreenlet_GET_PARENT PyGreenlet_GetParent
134
+
135
+ # define PyGreenlet_MAIN \
136
+ (*(int (*)(PyGreenlet*)) \
137
+ _PyGreenlet_API[PyGreenlet_MAIN_NUM])
138
+
139
+ # define PyGreenlet_STARTED \
140
+ (*(int (*)(PyGreenlet*)) \
141
+ _PyGreenlet_API[PyGreenlet_STARTED_NUM])
142
+
143
+ # define PyGreenlet_ACTIVE \
144
+ (*(int (*)(PyGreenlet*)) \
145
+ _PyGreenlet_API[PyGreenlet_ACTIVE_NUM])
146
+
147
+
148
+
149
+
150
+ /* Macro that imports greenlet and initializes C API */
151
+ /* NOTE: This has actually moved to ``greenlet._greenlet._C_API``, but we
152
+ keep the older definition to be sure older code that might have a copy of
153
+ the header still works. */
154
+ # define PyGreenlet_Import() \
155
+ { \
156
+ _PyGreenlet_API = (void**)PyCapsule_Import("greenlet._C_API", 0); \
157
+ }
158
+
159
+ #endif /* GREENLET_MODULE */
160
+
161
+ #ifdef __cplusplus
162
+ }
163
+ #endif
164
+ #endif /* !Py_GREENLETOBJECT_H */
myenv/pyvenv.cfg ADDED
@@ -0,0 +1,5 @@
1
+ home = /Users/paulmagee/.pyenv/versions/3.11.9/bin
2
+ include-system-site-packages = false
3
+ version = 3.11.9
4
+ executable = /Users/paulmagee/.pyenv/versions/3.11.9/bin/python3.11
5
+ command = /Users/paulmagee/.pyenv/versions/3.11.9/bin/python -m venv /Users/paulmagee/llamaindex-demo-1/myenv
myenv/share/man/man1/isympy.1 ADDED
@@ -0,0 +1,188 @@
1
+ '\" -*- coding: us-ascii -*-
2
+ .if \n(.g .ds T< \\FC
3
+ .if \n(.g .ds T> \\F[\n[.fam]]
4
+ .de URL
5
+ \\$2 \(la\\$1\(ra\\$3
6
+ ..
7
+ .if \n(.g .mso www.tmac
8
+ .TH isympy 1 2007-10-8 "" ""
9
+ .SH NAME
10
+ isympy \- interactive shell for SymPy
11
+ .SH SYNOPSIS
12
+ 'nh
13
+ .fi
14
+ .ad l
15
+ \fBisympy\fR \kx
16
+ .if (\nx>(\n(.l/2)) .nr x (\n(.l/5)
17
+ 'in \n(.iu+\nxu
18
+ [\fB-c\fR | \fB--console\fR] [\fB-p\fR ENCODING | \fB--pretty\fR ENCODING] [\fB-t\fR TYPE | \fB--types\fR TYPE] [\fB-o\fR ORDER | \fB--order\fR ORDER] [\fB-q\fR | \fB--quiet\fR] [\fB-d\fR | \fB--doctest\fR] [\fB-C\fR | \fB--no-cache\fR] [\fB-a\fR | \fB--auto\fR] [\fB-D\fR | \fB--debug\fR] [
19
+ -- | PYTHONOPTIONS]
20
+ 'in \n(.iu-\nxu
21
+ .ad b
22
+ 'hy
23
+ 'nh
24
+ .fi
25
+ .ad l
26
+ \fBisympy\fR \kx
27
+ .if (\nx>(\n(.l/2)) .nr x (\n(.l/5)
28
+ 'in \n(.iu+\nxu
29
+ [
30
+ {\fB-h\fR | \fB--help\fR}
31
+ |
32
+ {\fB-v\fR | \fB--version\fR}
33
+ ]
34
+ 'in \n(.iu-\nxu
35
+ .ad b
36
+ 'hy
37
+ .SH DESCRIPTION
38
+ isympy is a Python shell for SymPy. It is just a normal python shell
39
+ (ipython shell if you have the ipython package installed) that executes
40
+ the following commands so that you don't have to:
41
+ .PP
42
+ .nf
43
+ \*(T<
44
+ >>> from __future__ import division
45
+ >>> from sympy import *
46
+ >>> x, y, z = symbols("x,y,z")
47
+ >>> k, m, n = symbols("k,m,n", integer=True)
48
+ \*(T>
49
+ .fi
50
+ .PP
51
+ So starting isympy is equivalent to starting python (or ipython) and
52
+ executing the above commands by hand. It is intended for easy and quick
53
+ experimentation with SymPy. For more complicated programs, it is recommended
54
+ to write a script and import things explicitly (using the "from sympy
55
+ import sin, log, Symbol, ..." idiom).
56
+ .SH OPTIONS
57
+ .TP
58
+ \*(T<\fB\-c \fR\*(T>\fISHELL\fR, \*(T<\fB\-\-console=\fR\*(T>\fISHELL\fR
59
+ Use the specified shell (python or ipython) as
60
+ console backend instead of the default one (ipython
61
+ if present or python otherwise).
62
+
63
+ Example: isympy -c python
64
+
65
+ \fISHELL\fR could be either
66
+ \&'ipython' or 'python'
67
+ .TP
68
+ \*(T<\fB\-p \fR\*(T>\fIENCODING\fR, \*(T<\fB\-\-pretty=\fR\*(T>\fIENCODING\fR
69
+ Setup pretty printing in SymPy. By default, the most pretty, unicode
70
+ printing is enabled (if the terminal supports it). You can use less
71
+ pretty ASCII printing instead or no pretty printing at all.
72
+
73
+ Example: isympy -p no
74
+
75
+ \fIENCODING\fR must be one of 'unicode',
76
+ \&'ascii' or 'no'.
77
+ .TP
78
+ \*(T<\fB\-t \fR\*(T>\fITYPE\fR, \*(T<\fB\-\-types=\fR\*(T>\fITYPE\fR
79
+ Setup the ground types for the polys. By default, gmpy ground types
80
+ are used if gmpy2 or gmpy is installed, otherwise it falls back to python
81
+ ground types, which are a little bit slower. You can manually
82
+ choose python ground types even if gmpy is installed (e.g., for testing purposes).
83
+
84
+ Note that sympy ground types are not supported, and should be used
85
+ only for experimental purposes.
86
+
87
+ Note that the gmpy1 ground type is primarily intended for testing; it forces the
88
+ use of gmpy even if gmpy2 is available.
89
+
90
+ This is the same as setting the environment variable
91
+ SYMPY_GROUND_TYPES to the given ground type (e.g.,
92
+ SYMPY_GROUND_TYPES='gmpy')
93
+
94
+ The ground types can be determined interactively from the variable
95
+ sympy.polys.domains.GROUND_TYPES inside the isympy shell itself.
96
+
97
+ Example: isympy -t python
98
+
99
+ \fITYPE\fR must be one of 'gmpy',
100
+ \&'gmpy1' or 'python'.
101
+ .TP
102
+ \*(T<\fB\-o \fR\*(T>\fIORDER\fR, \*(T<\fB\-\-order=\fR\*(T>\fIORDER\fR
103
+ Setup the ordering of terms for printing. The default is lex, which
104
+ orders terms lexicographically (e.g., x**2 + x + 1). You can choose
105
+ other orderings, such as rev-lex, which will use reverse
106
+ lexicographic ordering (e.g., 1 + x + x**2).
107
+
108
+ Note that for very large expressions, ORDER='none' may speed up
109
+ printing considerably, with the tradeoff that the order of the terms
110
+ in the printed expression will have no canonical order
111
+
112
+ Example: isympy -o rev-lex
113
+
114
+ \fIORDER\fR must be one of 'lex', 'rev-lex', 'grlex',
115
+ \&'rev-grlex', 'grevlex', 'rev-grevlex', 'old', or 'none'.
116
+ .TP
117
+ \*(T<\fB\-q\fR\*(T>, \*(T<\fB\-\-quiet\fR\*(T>
118
+ Print only Python's and SymPy's versions to stdout at startup, and nothing else.
119
+ .TP
120
+ \*(T<\fB\-d\fR\*(T>, \*(T<\fB\-\-doctest\fR\*(T>
121
+ Use the same format that should be used for doctests. This is
122
+ equivalent to '\fIisympy -c python -p no\fR'.
123
+ .TP
124
+ \*(T<\fB\-C\fR\*(T>, \*(T<\fB\-\-no\-cache\fR\*(T>
125
+ Disable the caching mechanism. Disabling the cache may slow certain
126
+ operations down considerably. This is useful for testing the cache,
127
+ or for benchmarking, as the cache can result in deceptive benchmark timings.
128
+
129
+ This is the same as setting the environment variable SYMPY_USE_CACHE
130
+ to 'no'.
131
+ .TP
132
+ \*(T<\fB\-a\fR\*(T>, \*(T<\fB\-\-auto\fR\*(T>
133
+ Automatically create missing symbols. Normally, typing a name of a
134
+ Symbol that has not been instantiated first would raise NameError,
135
+ but with this option enabled, any undefined name will be
136
+ automatically created as a Symbol. This only works in IPython 0.11.
137
+
138
+ Note that this is intended only for interactive, calculator style
139
+ usage. In a script that uses SymPy, Symbols should be instantiated
140
+ at the top, so that it's clear what they are.
141
+
142
+ This will not override any names that are already defined, which
143
+ includes the single character letters represented by the mnemonic
144
+ QCOSINE (see the "Gotchas and Pitfalls" document in the
145
+ documentation). You can delete existing names by executing "del
146
+ name" in the shell itself. You can see if a name is defined by typing
147
+ "'name' in globals()".
148
+
149
+ The Symbols that are created using this have default assumptions.
150
+ If you want to place assumptions on symbols, you should create them
151
+ using symbols() or var().
152
+
153
+ Finally, this only works in the top level namespace. So, for
154
+ example, if you define a function in isympy with an undefined
155
+ Symbol, it will not work.
156
+ .TP
157
+ \*(T<\fB\-D\fR\*(T>, \*(T<\fB\-\-debug\fR\*(T>
158
+ Enable debugging output. This is the same as setting the
159
+ environment variable SYMPY_DEBUG to 'True'. The debug status is set
160
+ in the variable SYMPY_DEBUG within isympy.
161
+ .TP
162
+ -- \fIPYTHONOPTIONS\fR
163
+ These options will be passed on to \fIipython (1)\fR shell.
164
+ Only supported when ipython is being used (standard python shell not supported).
165
+
166
+ Two dashes (--) are required to separate \fIPYTHONOPTIONS\fR
167
+ from the other isympy options.
168
+
169
+ For example, to run iSymPy without startup banner and colors:
170
+
171
+ isympy -q -c ipython -- --colors=NoColor
172
+ .TP
173
+ \*(T<\fB\-h\fR\*(T>, \*(T<\fB\-\-help\fR\*(T>
174
+ Print help output and exit.
175
+ .TP
176
+ \*(T<\fB\-v\fR\*(T>, \*(T<\fB\-\-version\fR\*(T>
177
+ Print isympy version information and exit.
178
+ .SH FILES
179
+ .TP
180
+ \*(T<\fI${HOME}/.sympy\-history\fR\*(T>
181
+ Saves the history of commands when using the python
182
+ shell as backend.
183
+ .SH BUGS
184
+ The upstreams BTS can be found at \(lahttps://github.com/sympy/sympy/issues\(ra
185
+ Please report all bugs that you find in there, this will help improve
186
+ the overall quality of SymPy.
187
+ .SH "SEE ALSO"
188
+ \fBipython\fR(1), \fBpython\fR(1)
requirements.txt ADDED
@@ -0,0 +1,10 @@
1
+ llama-index-core>=0.10.0
2
+ llama-index-llms-anthropic>=0.1.0
3
+ llama-index-embeddings-huggingface>=0.1.0
4
+ streamlit>=1.29.0
5
+ anthropic>=0.49.0
6
+ python-dotenv>=1.0.0
7
+ tqdm>=4.66.0
8
+ sqlalchemy>=2.0.0
9
+ sentence-transformers>=2.3.0
10
+ pytest>=7.4.0
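For reference, a typical local setup against this requirements file (an illustrative sequence, not part of the commit):

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt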