Shouvik599 committed on
Commit · 22fd41f
Parent(s):
Initial clean commit
Browse files:
- .dockerignore +9 -0
- .gitattributes +1 -0
- .github/workflows/main.yml +21 -0
- .gitignore +0 -0
- Dockerfile +18 -0
- README.md +116 -0
- app.py +133 -0
- books/A-Quran-Translation.pdf +3 -0
- books/Bhagavad-gita-As-It-Is.pdf +3 -0
- books/CSB_Pew_Bible_2nd_Printing.pdf +3 -0
- books/Siri Guru Granth.pdf +3 -0
- env.example +18 -0
- frontend/index.html +626 -0
- ingest.py +179 -0
- rag_chain.py +240 -0
- requirements.txt +26 -0
.dockerignore ADDED
@@ -0,0 +1,9 @@
.git
.gitignore
.venv
__pycache__
chroma_db
*.md
env.example
.dockerignore
Dockerfile
.gitattributes ADDED
@@ -0,0 +1 @@
books/*.pdf filter=lfs diff=lfs merge=lfs -text
.github/workflows/main.yml ADDED
@@ -0,0 +1,21 @@
name: Sync to Hugging Face hub
on:
  push:
    branches: [main]
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
          lfs: true

      - name: Push to hub
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        # Replace <USERNAME> and <SPACE_NAME> with your actual HF details
        run: git push --force https://Shouvik99:$HF_TOKEN@huggingface.co/spaces/Shouvik99/LifeGuide main
.gitignore ADDED
Binary file (2.08 kB).
Dockerfile ADDED
@@ -0,0 +1,18 @@
# Use an official Python runtime as a parent image
FROM python:3.11-slim

# Set the working directory in the container
WORKDIR /app

# Copy the requirements file and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Run the data ingestion script
RUN python ingest.py

# Run the application in Hugging Face Space
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,116 @@
---
title: Sacred Texts RAG
emoji: ποΈ
colorFrom: gold
colorTo: white
sdk: docker
app_port: 7860
pinned: false
---

# Sacred Texts RAG – Multi-Religion Knowledge Base

A Retrieval-Augmented Generation (RAG) application that answers spiritual queries using the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib as its sole knowledge sources.

---

## Project Structure

```
sacred-texts-rag/
├── README.md
├── requirements.txt
├── .env.example
├── ingest.py          # Step 1: Load PDFs → chunk → embed → store
├── rag_chain.py       # Core RAG chain logic
├── app.py             # FastAPI backend server
└── frontend/
    └── index.html     # Chat UI (open in browser)
```

---

## ⚙️ Setup Instructions

### 1. Install Dependencies
```bash
pip install -r requirements.txt
```

### 2. Configure Environment
```bash
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY
```

### 3. Add Your PDF Books
Place your PDF files in a `books/` folder:
```
books/
├── bhagavad_gita.pdf
├── quran.pdf
├── bible.pdf
└── guru_granth_sahib.pdf
```

### 4. Ingest the Books (Run Once)
```bash
python ingest.py
```
This will:
- Load and parse all PDFs
- Split them into semantic chunks
- Create embeddings using NVIDIA's `llama-nemotron-embed-vl-1b-v2` model
- Store them in a local ChromaDB vector store (`./chroma_db/`)
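The chunking step above can be sketched with a plain character-based splitter with overlap (a hedged sketch: the actual `ingest.py` likely uses a library text splitter, and the `chunk_size`/`overlap` values here are illustrative):

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks so passages spanning a chunk
    boundary are not lost. Values are illustrative, not the project's."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += chunk_size - overlap  # step back by `overlap` characters
    return chunks

# 2500 characters of stand-in "PDF text" → 3 chunks of 1000/1000/900 chars
chunks = split_text("x" * 2500, chunk_size=1000, overlap=200)
```

Each chunk would then be embedded and written to the Chroma collection along with its book and page metadata.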

### 5. Start the Backend
```bash
python app.py
```
Server runs at: `http://localhost:8000`

### 6. Open the Frontend
Open `frontend/index.html` in your browser – no server needed for the UI.

---

## Environment Variables

| Variable | Description |
|---|---|
| `GEMINI_API_KEY` | Your Google Gemini API key |
| `NVIDIA_API_KEY` | Your NVIDIA API key |
| `CHROMA_DB_PATH` | Path to ChromaDB storage (default: `./chroma_db`) |
| `CHUNKS_PER_BOOK` | Number of chunks to retrieve per query, per book (default: `3`) |

---

## How It Works

```
User Query
    │
    ▼
[Embedding Model] ◄── NVIDIA llama-nemotron-embed-vl-1b-v2
    │
    ▼
[ChromaDB Vector Store] ◄── Semantic similarity search
    │ (retrieves top-K chunks from the Gita, Quran, Bible, and Guru Granth Sahib)
    │
    ▼
[Prompt with Context]
    │
    ▼
[Gemini 2.5 Flash Lite] ◄── Answer grounded ONLY in retrieved texts
    │
    ▼
Response with source citations (book + chapter/verse)
```

---

## Notes

- The LLM is instructed **never** to answer from outside the provided texts
- Each response includes **source citations** (which book the answer came from)
- Responses synthesize wisdom **across all books** when relevant
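The grounding behavior described in the notes amounts to a prompt template that forbids outside knowledge. A minimal sketch (hypothetical wording and helper name — the real prompt lives in `rag_chain.py`):

```python
def build_grounded_prompt(question: str, contexts: list[dict]) -> str:
    """Assemble a prompt that restricts the LLM to the retrieved passages.
    Hypothetical template; the project's actual wording may differ."""
    context_block = "\n\n".join(
        f"[{c['book']}, p.{c['page']}] {c['text']}" for c in contexts
    )
    return (
        "Answer ONLY from the passages below. If the passages do not "
        "contain the answer, say so. Cite the book for every claim.\n\n"
        f"PASSAGES:\n{context_block}\n\nQUESTION: {question}\nANSWER:"
    )

prompt = build_grounded_prompt(
    "What is compassion?",
    [{"book": "Bhagavad Gita", "page": 12, "text": "..."}],
)
```

Because each passage carries its book and page, the model can emit the citations the API returns in the `sources` field.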
app.py ADDED
@@ -0,0 +1,133 @@
"""
app.py – FastAPI backend server for the Sacred Texts RAG application.

Endpoints:
    POST /ask    → Ask a question, get an answer with sources
    GET  /health → Health check
    GET  /books  → List books currently in the knowledge base

Run with:
    python app.py
"""

import os
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel, Field
from dotenv import load_dotenv

from rag_chain import query_sacred_texts, get_embeddings, get_vector_store  # ← FIXED

load_dotenv()

# ─── App Setup ────────────────────────────────────────────────────────────────

app = FastAPI(
    title="Sacred Texts RAG API",
    description="Ask questions answered exclusively from the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib",
    version="1.0.0",
)

# Allow requests from the local frontend (index.html opened as file://)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Restrict in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


# ─── Request / Response Models ────────────────────────────────────────────────

class AskRequest(BaseModel):
    question: str = Field(..., min_length=3, max_length=1000,
                          example="What do the scriptures say about compassion?")

class Source(BaseModel):
    book: str
    page: int | str
    snippet: str

class AskResponse(BaseModel):
    question: str
    answer: str
    sources: list[Source]

class HealthResponse(BaseModel):
    status: str
    message: str

class BooksResponse(BaseModel):
    books: list[str]
    total_chunks: int


# ─── Routes ───────────────────────────────────────────────────────────────────

@app.get("/health", response_model=HealthResponse, tags=["System"])
def health_check():
    """Check that the API is running."""
    return {"status": "ok", "message": "Sacred Texts RAG is running"}


@app.get("/books", response_model=BooksResponse, tags=["Knowledge Base"])
def list_books():
    """List all books currently indexed in the knowledge base."""
    try:
        embeddings = get_embeddings()                # ← FIXED Step 1
        vector_store = get_vector_store(embeddings)  # ← FIXED Step 2
        collection = vector_store._collection
        results = collection.get(include=["metadatas"])
        metadatas = results.get("metadatas", [])

        books = sorted(set(
            m.get("book", "Unknown")
            for m in metadatas
            if m  # guard against None
        ))
        return {"books": books, "total_chunks": len(metadatas)}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Could not read knowledge base: {e}")


@app.post("/ask", response_model=AskResponse, tags=["Query"])
def ask(request: AskRequest):
    """
    Ask a spiritual or philosophical question.
    The answer is grounded strictly in the sacred texts.
    """
    if not request.question.strip():
        raise HTTPException(status_code=400, detail="Question cannot be empty.")

    try:
        result = query_sacred_texts(request.question)
        return AskResponse(
            question=request.question,
            answer=result["answer"],
            sources=[Source(**s) for s in result["sources"]],
        )
    except FileNotFoundError:
        raise HTTPException(
            status_code=503,
            detail="Knowledge base not found. Run `python ingest.py` first.",
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


# ─── Entry Point ──────────────────────────────────────────────────────────────

if __name__ == "__main__":
    import uvicorn

    host = os.getenv("HOST", "0.0.0.0")
    port = int(os.getenv("PORT", "8000"))

    print(f"\nSacred Texts RAG – API Server")
    print(f"{'─' * 40}")
    print(f"Running at : http://localhost:{port}")
    print(f"Docs at    : http://localhost:{port}/docs")
    print(f"{'─' * 40}\n")

    uvicorn.run("app:app", host=host, port=port, reload=True)
books/A-Quran-Translation.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3fa3d5c08e744166f064cbb63663737aa40025c2f582ee37aa3ceffe282aebcd
size 3894852
books/Bhagavad-gita-As-It-Is.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff112b0b056d303b792f6f2e68cbd73a89adf612fa9113f932446cdea7741583
size 66135830
books/CSB_Pew_Bible_2nd_Printing.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cb7a72772507690b41ac1b08a36ea355422e1b6561e0438bfeeef73504c53ebd
size 16634733
books/Siri Guru Granth.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b03376ce26b6fc709500dec2a1a4a1bbfdde739716159deeb790892958c97cb6
size 7066831
env.example ADDED
@@ -0,0 +1,18 @@
# ─── Google Gemini (LLM for answer generation) ────────────────────────────────
GEMINI_API_KEY=your_gemini_api_key_here

# ─── NVIDIA (Embeddings) ──────────────────────────────────────────────────────
NVIDIA_API_KEY=nvapi-your_nvidia_api_key_here

# ─── Vector Store ─────────────────────────────────────────────────────────────
CHROMA_DB_PATH=./chroma_db
COLLECTION_NAME=sacred_texts

# ─── Retrieval Settings ───────────────────────────────────────────────────────
# Chunks retrieved PER BOOK – every scripture gets this many slots guaranteed
# Total context = CHUNKS_PER_BOOK x number of books (e.g. 3 x 4 = 12 chunks)
CHUNKS_PER_BOOK=3

# ─── Server ───────────────────────────────────────────────────────────────────
HOST=0.0.0.0
PORT=8000
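The per-book budget described above (every scripture keeps `CHUNKS_PER_BOOK` slots rather than competing in one global top-k) can be sketched without any vector-store dependency. The helper name and chunk shape are hypothetical; the real logic lives in `rag_chain.py`:

```python
def retrieve_per_book(scored_chunks: list[dict], chunks_per_book: int = 3) -> list[dict]:
    """Give every book a guaranteed number of slots so one long scripture
    cannot crowd the others out of the context window.

    scored_chunks: [{"book": ..., "score": ..., "text": ...}, ...]
    (hypothetical shape for illustration)
    """
    by_book: dict[str, list[dict]] = {}
    for chunk in scored_chunks:
        by_book.setdefault(chunk["book"], []).append(chunk)

    selected = []
    for chunks in by_book.values():
        chunks.sort(key=lambda c: c["score"], reverse=True)  # best first
        selected.extend(chunks[:chunks_per_book])            # per-book quota
    return selected

hits = [
    {"book": "Quran", "score": 0.9, "text": "a"},
    {"book": "Quran", "score": 0.8, "text": "b"},
    {"book": "Bible", "score": 0.7, "text": "c"},
]
picked = retrieve_per_book(hits, chunks_per_book=1)  # one slot per book
```

With `CHUNKS_PER_BOOK=3` and four books this yields the 12-chunk context the comment above computes.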
frontend/index.html
ADDED
|
@@ -0,0 +1,626 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
<!DOCTYPE html>
|
| 2 |
+
<html lang="en">
|
| 3 |
+
<head>
|
| 4 |
+
<meta charset="UTF-8" />
|
| 5 |
+
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
| 6 |
+
<title>Sacred Texts β Divine Knowledge</title>
|
| 7 |
+
<link rel="preconnect" href="https://fonts.googleapis.com" />
|
| 8 |
+
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
|
| 9 |
+
<link href="https://fonts.googleapis.com/css2?family=Cinzel+Decorative:wght@400;700&family=Cormorant+Garamond:ital,wght@0,300;0,400;0,600;1,300;1,400&family=IM+Fell+English:ital@0;1&display=swap" rel="stylesheet" />
|
| 10 |
+
|
| 11 |
+
<style>
|
| 12 |
+
/* ββ Reset & Base βββββββββββββββββββββββββββββββββββββββββββ */
|
| 13 |
+
*, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
|
| 14 |
+
|
| 15 |
+
:root {
|
| 16 |
+
--bg: #0d0b07;
|
| 17 |
+
--surface: #16130d;
|
| 18 |
+
--surface-2: #1e1a11;
|
| 19 |
+
--border: #3a2e1a;
|
| 20 |
+
--gold: #c9993a;
|
| 21 |
+
--gold-light: #e8c170;
|
| 22 |
+
--gold-pale: #f5e4b0;
|
| 23 |
+
--cream: #f0e6cc;
|
| 24 |
+
--muted: #7a6a4a;
|
| 25 |
+
--gita: #e07b3b; /* saffron */
|
| 26 |
+
--quran: #3bba85; /* green */
|
| 27 |
+
--bible: #5b8ce0; /* blue */
|
| 28 |
+
--granth: #b07ce0; /* violet β Sikh royal purple */
|
| 29 |
+
}
|
| 30 |
+
|
| 31 |
+
html, body {
|
| 32 |
+
height: 100%;
|
| 33 |
+
background: var(--bg);
|
| 34 |
+
color: var(--cream);
|
| 35 |
+
font-family: 'Cormorant Garamond', Georgia, serif;
|
| 36 |
+
font-size: 18px;
|
| 37 |
+
line-height: 1.7;
|
| 38 |
+
overflow: hidden;
|
| 39 |
+
}
|
| 40 |
+
|
| 41 |
+
/* ββ Background texture βββββββββββββββββββββββββββββββββββββ */
|
| 42 |
+
body::before {
|
| 43 |
+
content: '';
|
| 44 |
+
position: fixed; inset: 0;
|
| 45 |
+
background:
|
| 46 |
+
radial-gradient(ellipse 80% 60% at 20% 10%, rgba(201,153,58,.07) 0%, transparent 60%),
|
| 47 |
+
radial-gradient(ellipse 60% 80% at 80% 90%, rgba(91,140,224,.05) 0%, transparent 60%),
|
| 48 |
+
radial-gradient(ellipse 50% 50% at 50% 50%, rgba(176,124,224,.04) 0%, transparent 60%),
|
| 49 |
+
url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='400' height='400'%3E%3Cfilter id='n'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='0.75' numOctaves='4' stitchTiles='stitch'/%3E%3CfeColorMatrix type='saturate' values='0'/%3E%3C/filter%3E%3Crect width='400' height='400' filter='url(%23n)' opacity='0.04'/%3E%3C/svg%3E");
|
| 50 |
+
pointer-events: none;
|
| 51 |
+
z-index: 0;
|
| 52 |
+
}
|
| 53 |
+
|
| 54 |
+
/* ββ Layout βββββββββββββββββββββββββββββββββββββββββββββββββ */
|
| 55 |
+
.app {
|
| 56 |
+
position: relative;
|
| 57 |
+
z-index: 1;
|
| 58 |
+
display: grid;
|
| 59 |
+
grid-template-rows: auto 1fr auto;
|
| 60 |
+
height: 100vh;
|
| 61 |
+
max-width: 860px;
|
| 62 |
+
margin: 0 auto;
|
| 63 |
+
padding: 0 16px;
|
| 64 |
+
}
|
| 65 |
+
|
| 66 |
+
/* ββ Header βββββββββββββββββββββββββββββββββββββββββββββββββ */
|
| 67 |
+
header {
|
| 68 |
+
padding: 28px 0 18px;
|
| 69 |
+
text-align: center;
|
| 70 |
+
border-bottom: 1px solid var(--border);
|
| 71 |
+
}
|
| 72 |
+
|
| 73 |
+
.mandala {
|
| 74 |
+
font-size: 2rem;
|
| 75 |
+
letter-spacing: .5rem;
|
| 76 |
+
color: var(--gold);
|
| 77 |
+
opacity: .6;
|
| 78 |
+
margin-bottom: 8px;
|
| 79 |
+
animation: spin 60s linear infinite;
|
| 80 |
+
display: inline-block;
|
| 81 |
+
}
|
| 82 |
+
@keyframes spin { to { transform: rotate(360deg); } }
|
| 83 |
+
|
| 84 |
+
h1 {
|
| 85 |
+
font-family: 'Cinzel Decorative', serif;
|
| 86 |
+
font-size: clamp(1.2rem, 3vw, 1.9rem);
|
| 87 |
+
font-weight: 400;
|
| 88 |
+
color: var(--gold-pale);
|
| 89 |
+
letter-spacing: .12em;
|
| 90 |
+
text-shadow: 0 0 40px rgba(201,153,58,.3);
|
| 91 |
+
}
|
| 92 |
+
|
| 93 |
+
.subtitle {
|
| 94 |
+
font-family: 'IM Fell English', serif;
|
| 95 |
+
font-style: italic;
|
| 96 |
+
font-size: .95rem;
|
| 97 |
+
color: var(--muted);
|
| 98 |
+
margin-top: 4px;
|
| 99 |
+
}
|
| 100 |
+
|
| 101 |
+
.badges {
|
| 102 |
+
display: flex;
|
| 103 |
+
justify-content: center;
|
| 104 |
+
gap: 12px;
|
| 105 |
+
margin-top: 12px;
|
| 106 |
+
flex-wrap: wrap;
|
| 107 |
+
}
|
| 108 |
+
|
| 109 |
+
.badge {
|
| 110 |
+
font-size: .72rem;
|
| 111 |
+
letter-spacing: .1em;
|
| 112 |
+
text-transform: uppercase;
|
| 113 |
+
padding: 3px 10px;
|
| 114 |
+
border-radius: 20px;
|
| 115 |
+
border: 1px solid;
|
| 116 |
+
font-family: 'Cormorant Garamond', serif;
|
| 117 |
+
font-weight: 600;
|
| 118 |
+
}
|
| 119 |
+
.badge-gita { color: var(--gita); border-color: var(--gita); background: rgba(224,123,59,.1); }
|
| 120 |
+
.badge-quran { color: var(--quran); border-color: var(--quran); background: rgba(59,186,133,.1); }
|
| 121 |
+
.badge-bible { color: var(--bible); border-color: var(--bible); background: rgba(91,140,224,.1); }
|
| 122 |
+
.badge-granth { color: var(--granth); border-color: var(--granth); background: rgba(176,124,224,.1); }
|
| 123 |
+
|
| 124 |
+
/* ββ Chat Window ββββββββββββββββββββββββββββββββββββββββββββ */
|
| 125 |
+
.chat-window {
|
| 126 |
+
overflow-y: auto;
|
| 127 |
+
padding: 28px 0;
|
| 128 |
+
display: flex;
|
| 129 |
+
flex-direction: column;
|
| 130 |
+
gap: 24px;
|
| 131 |
+
scrollbar-width: thin;
|
| 132 |
+
scrollbar-color: var(--border) transparent;
|
| 133 |
+
}
|
| 134 |
+
.chat-window::-webkit-scrollbar { width: 4px; }
|
| 135 |
+
.chat-window::-webkit-scrollbar-thumb { background: var(--border); border-radius: 4px; }
|
| 136 |
+
|
| 137 |
+
/* ββ Welcome State ββββββββββββββββββββββββββββββββββββββββββ */
|
| 138 |
+
.welcome {
|
| 139 |
+
text-align: center;
|
| 140 |
+
margin: auto;
|
| 141 |
+
padding: 20px;
|
| 142 |
+
max-width: 500px;
|
| 143 |
+
}
|
| 144 |
+
|
| 145 |
+
.welcome-icon {
|
| 146 |
+
font-size: 3.5rem;
|
| 147 |
+
margin-bottom: 16px;
|
| 148 |
+
filter: drop-shadow(0 0 20px rgba(201,153,58,.4));
|
| 149 |
+
}
|
| 150 |
+
|
| 151 |
+
.welcome h2 {
|
| 152 |
+
font-family: 'IM Fell English', serif;
|
| 153 |
+
font-style: italic;
|
| 154 |
+
font-size: 1.5rem;
|
| 155 |
+
color: var(--gold-light);
|
| 156 |
+
margin-bottom: 10px;
|
| 157 |
+
}
|
| 158 |
+
|
| 159 |
+
.welcome p {
|
| 160 |
+
font-size: .95rem;
|
| 161 |
+
color: var(--muted);
|
| 162 |
+
line-height: 1.8;
|
| 163 |
+
}
|
| 164 |
+
|
| 165 |
+
.suggested-queries {
|
| 166 |
+
margin-top: 24px;
|
| 167 |
+
display: flex;
|
| 168 |
+
flex-direction: column;
|
| 169 |
+
gap: 8px;
|
| 170 |
+
}
|
| 171 |
+
|
| 172 |
+
.suggested-queries button {
|
| 173 |
+
background: var(--surface);
|
| 174 |
+
border: 1px solid var(--border);
|
| 175 |
+
color: var(--cream);
|
| 176 |
+
padding: 10px 16px;
|
| 177 |
+
border-radius: 8px;
|
| 178 |
+
font-family: 'Cormorant Garamond', serif;
|
| 179 |
+
font-size: .95rem;
|
| 180 |
+
font-style: italic;
|
| 181 |
+
cursor: pointer;
|
| 182 |
+
transition: all .2s;
|
| 183 |
+
text-align: left;
|
| 184 |
+
}
|
| 185 |
+
.suggested-queries button:hover {
|
| 186 |
+
border-color: var(--gold);
|
| 187 |
+
color: var(--gold-pale);
|
| 188 |
+
background: var(--surface-2);
|
| 189 |
+
}
|
| 190 |
+
|
| 191 |
+
/* ββ Messages βββββββββββββββββββββββββββββββββββββββββββββββ */
|
| 192 |
+
.message {
|
| 193 |
+
display: flex;
|
| 194 |
+
flex-direction: column;
|
| 195 |
+
gap: 8px;
|
| 196 |
+
animation: fadeUp .4s ease both;
|
| 197 |
+
}
|
| 198 |
+
@keyframes fadeUp {
|
| 199 |
+
from { opacity: 0; transform: translateY(12px); }
|
| 200 |
+
to { opacity: 1; transform: translateY(0); }
|
| 201 |
+
}
|
| 202 |
+
|
| 203 |
+
.message-user {
|
| 204 |
+
align-items: flex-end;
|
| 205 |
+
}
|
| 206 |
+
.message-assistant {
|
| 207 |
+
align-items: flex-start;
|
| 208 |
+
}
|
| 209 |
+
|
| 210 |
+
.msg-label {
|
| 211 |
+
font-size: .7rem;
|
| 212 |
+
letter-spacing: .15em;
|
| 213 |
+
text-transform: uppercase;
|
| 214 |
+
color: var(--muted);
|
| 215 |
+
font-weight: 600;
|
| 216 |
+
padding: 0 4px;
|
| 217 |
+
}
|
| 218 |
+
|
| 219 |
+
.msg-bubble {
|
| 220 |
+
max-width: 92%;
|
| 221 |
+
padding: 16px 20px;
|
| 222 |
+
border-radius: 12px;
|
| 223 |
+
line-height: 1.75;
|
| 224 |
+
}
|
| 225 |
+
|
| 226 |
+
.message-user .msg-bubble {
|
| 227 |
+
background: var(--surface-2);
|
| 228 |
+
border: 1px solid var(--border);
|
| 229 |
+
color: var(--cream);
|
| 230 |
+
font-style: italic;
|
| 231 |
+
font-size: 1rem;
|
| 232 |
+
border-bottom-right-radius: 4px;
|
| 233 |
+
}
|
| 234 |
+
|
| 235 |
+
.message-assistant .msg-bubble {
|
| 236 |
+
background: linear-gradient(135deg, var(--surface) 0%, rgba(30,26,17,.95) 100%);
|
| 237 |
+
border: 1px solid rgba(201,153,58,.2);
|
| 238 |
+
color: var(--cream);
|
| 239 |
+
font-size: 1rem;
|
| 240 |
+
border-bottom-left-radius: 4px;
|
| 241 |
+
box-shadow: 0 4px 24px rgba(0,0,0,.4), inset 0 1px 0 rgba(201,153,58,.1);
|
| 242 |
+
}
|
| 243 |
+
|
| 244 |
+
.msg-bubble p { margin-bottom: 1em; }
|
| 245 |
+
.msg-bubble p:last-child { margin-bottom: 0; }
|
| 246 |
+
.msg-bubble strong { color: var(--gold-light); font-weight: 600; }
|
| 247 |
+
|
| 248 |
+
/* ββ Sources Panel ββββββββββββββββββββββββββββββββββββββββββ */
|
| 249 |
+
.sources {
|
| 250 |
+
max-width: 92%;
|
| 251 |
+
margin-top: 4px;
|
| 252 |
+
}
|
| 253 |
+
|
| 254 |
+
.sources-label {
|
| 255 |
+
font-size: .72rem;
|
| 256 |
+
letter-spacing: .12em;
|
| 257 |
+
text-transform: uppercase;
|
| 258 |
+
color: var(--muted);
|
| 259 |
+
margin-bottom: 6px;
|
| 260 |
+
display: flex;
|
| 261 |
+
align-items: center;
|
| 262 |
+
gap: 6px;
|
| 263 |
+
}
|
| 264 |
+
.sources-label::before, .sources-label::after {
|
| 265 |
+
content: '';
|
| 266 |
+
flex: 1;
|
| 267 |
+
height: 1px;
|
| 268 |
+
background: var(--border);
|
| 269 |
+
}
|
| 270 |
+
.sources-label::before { max-width: 20px; }
|
| 271 |
+
|
| 272 |
+
.source-tags {
|
| 273 |
+
display: flex;
|
| 274 |
+
flex-wrap: wrap;
|
| 275 |
+
gap: 6px;
|
| 276 |
+
}
|
| 277 |
+
|
| 278 |
+
.source-tag {
|
| 279 |
+
font-size: .78rem;
|
| 280 |
+
padding: 4px 10px;
|
| 281 |
+
border-radius: 6px;
|
| 282 |
+
border: 1px solid;
|
| 283 |
+
font-family: 'Cormorant Garamond', serif;
|
| 284 |
+
cursor: default;
|
| 285 |
+
transition: all .2s;
|
| 286 |
+
}
|
| 287 |
+
.source-tag:hover { transform: translateY(-1px); filter: brightness(1.2); }
|
| 288 |
+
.source-gita { color: var(--gita); border-color: rgba(224,123,59,.4); background: rgba(224,123,59,.08); }
|
| 289 |
+
.source-quran { color: var(--quran); border-color: rgba(59,186,133,.4); background: rgba(59,186,133,.08); }
|
| 290 |
+
.source-bible { color: var(--bible); border-color: rgba(91,140,224,.4); background: rgba(91,140,224,.08); }
|
| 291 |
+
.source-granth { color: var(--granth); border-color: rgba(176,124,224,.4); background: rgba(176,124,224,.08); }
|
| 292 |
+
.source-other { color: var(--gold-light); border-color: rgba(201,153,58,.4); background: rgba(201,153,58,.08); }
|
| 293 |
+
|
| 294 |
+
/* ββ Loading ββββββββββββββββββββββββββββββββββββββββββββββββ */
|
| 295 |
+
.loading {
|
| 296 |
+
display: flex;
|
| 297 |
+
align-items: center;
|
| 298 |
+
gap: 12px;
|
| 299 |
+
padding: 14px 18px;
|
| 300 |
+
border: 1px solid rgba(201,153,58,.15);
|
| 301 |
+
border-radius: 12px;
|
| 302 |
+
  background: var(--surface);
  width: fit-content;
  max-width: 280px;
}

.loading-dots {
  display: flex;
  gap: 5px;
}
.loading-dots span {
  width: 6px; height: 6px;
  border-radius: 50%;
  background: var(--gold);
  animation: dot-pulse 1.4s ease-in-out infinite;
}
.loading-dots span:nth-child(2) { animation-delay: .2s; }
.loading-dots span:nth-child(3) { animation-delay: .4s; }
@keyframes dot-pulse {
  0%, 80%, 100% { opacity: .2; transform: scale(.8); }
  40% { opacity: 1; transform: scale(1.1); }
}

.loading-text {
  font-size: .85rem;
  font-style: italic;
  color: var(--muted);
}

/* ── Error ────────────────────────────────────────────────── */
.error-bubble {
  background: rgba(180, 60, 60, .1);
  border: 1px solid rgba(180, 60, 60, .3);
  color: #e08080;
  padding: 12px 16px;
  border-radius: 10px;
  font-size: .9rem;
  max-width: 92%;
}

/* ── Input Area ───────────────────────────────────────────── */
.input-area {
  padding: 16px 0 24px;
  border-top: 1px solid var(--border);
}

.input-row {
  display: flex;
  gap: 10px;
  align-items: flex-end;
}

textarea {
  flex: 1;
  background: var(--surface);
  border: 1px solid var(--border);
  color: var(--cream);
  padding: 14px 16px;
  border-radius: 12px;
  font-family: 'Cormorant Garamond', serif;
  font-size: 1rem;
  line-height: 1.6;
  resize: none;
  min-height: 52px;
  max-height: 140px;
  outline: none;
  transition: border-color .2s, box-shadow .2s;
}
textarea::placeholder { color: var(--muted); font-style: italic; }
textarea:focus {
  border-color: rgba(201,153,58,.5);
  box-shadow: 0 0 0 3px rgba(201,153,58,.08);
}

.send-btn {
  width: 52px; height: 52px;
  border-radius: 12px;
  border: 1px solid rgba(201,153,58,.4);
  background: linear-gradient(135deg, rgba(201,153,58,.2), rgba(201,153,58,.05));
  color: var(--gold);
  font-size: 1.3rem;
  cursor: pointer;
  transition: all .2s;
  display: flex;
  align-items: center;
  justify-content: center;
  flex-shrink: 0;
}
.send-btn:hover:not(:disabled) {
  background: linear-gradient(135deg, rgba(201,153,58,.35), rgba(201,153,58,.15));
  border-color: var(--gold);
  transform: translateY(-1px);
  box-shadow: 0 4px 16px rgba(201,153,58,.2);
}
.send-btn:disabled { opacity: .3; cursor: not-allowed; transform: none; }

.input-hint {
  font-size: .72rem;
  color: var(--muted);
  margin-top: 8px;
  text-align: center;
  font-style: italic;
}

/* ── Divider line ─────────────────────────────────────────── */
.ornament {
  text-align: center;
  color: var(--border);
  font-size: .8rem;
  letter-spacing: .4em;
  margin: 4px 0;
}
</style>
</head>
<body>
<div class="app">

  <!-- Header -->
  <header>
    <div class="mandala">✦</div>
    <h1>Life Guide</h1>
    <p class="subtitle">Wisdom from the Bhagavad Gita, Quran, Bible &amp; Guru Granth Sahib</p>
    <div class="badges">
      <span class="badge badge-gita">Bhagavad Gita</span>
      <span class="badge badge-quran">Quran</span>
      <span class="badge badge-bible">Bible</span>
      <span class="badge badge-granth">Guru Granth Sahib</span>
    </div>
  </header>

  <!-- Chat Window -->
  <div class="chat-window" id="chatWindow">
    <div class="welcome" id="welcomePane">
      <div class="welcome-icon">🕉️</div>
      <h2>"Seek, and it shall be given unto you"</h2>
      <p>Ask any spiritual or philosophical question. Answers are drawn exclusively from the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib.</p>
      <div class="suggested-queries">
        <button onclick="askSuggested(this)">What do the scriptures say about forgiveness?</button>
        <button onclick="askSuggested(this)">How should one face fear and death?</button>
        <button onclick="askSuggested(this)">What is the purpose of prayer and worship?</button>
        <button onclick="askSuggested(this)">What is the nature of the soul according to each religion?</button>
        <button onclick="askSuggested(this)">What do the scriptures teach about humility and selfless service?</button>
      </div>
    </div>
  </div>

  <!-- Input -->
  <div class="input-area">
    <div class="input-row">
      <textarea
        id="questionInput"
        placeholder="Ask a question from the sacred texts…"
        rows="1"
        onkeydown="handleKey(event)"
        oninput="autoResize(this)"
      ></textarea>
      <button class="send-btn" id="sendBtn" onclick="sendQuestion()" title="Ask (Enter)">
        ✦
      </button>
    </div>
    <p class="input-hint">Press Enter to ask · Shift+Enter for new line · Answers grounded strictly in the sacred texts</p>
  </div>

</div>

<script>
  const API_BASE = "http://localhost:8000";
  let isLoading = false;

  // ── Helpers ────────────────────────────────────────────────
  function getSourceClass(book) {
    const b = book.toLowerCase();
    if (b.includes("gita")) return "source-gita";
    if (b.includes("quran") || b.includes("koran")) return "source-quran";
    if (b.includes("bible") || b.includes("testament")) return "source-bible";
    if (b.includes("granth") || b.includes("guru")) return "source-granth";
    return "source-other";
  }

  function hideWelcome() {
    const w = document.getElementById("welcomePane");
    if (w) w.remove();
  }

  function scrollToBottom() {
    const w = document.getElementById("chatWindow");
    w.scrollTop = w.scrollHeight;
  }

  function autoResize(el) {
    el.style.height = "auto";
    el.style.height = Math.min(el.scrollHeight, 140) + "px";
  }

  function formatAnswer(text) {
    // Convert markdown-ish bold (**text**) to <strong>
    text = text.replace(/\*\*(.*?)\*\*/g, "<strong>$1</strong>");
    // Wrap paragraphs
    return text.split(/\n\n+/).filter(p => p.trim()).map(p => `<p>${p.trim()}</p>`).join("");
  }

  // ── Append message to chat ─────────────────────────────────
  function appendUserMessage(question) {
    const w = document.getElementById("chatWindow");
    const div = document.createElement("div");
    div.className = "message message-user";
    div.innerHTML = `
      <span class="msg-label">You</span>
      <div class="msg-bubble">${escapeHtml(question)}</div>
    `;
    w.appendChild(div);
    scrollToBottom();
  }

  function appendLoading() {
    const w = document.getElementById("chatWindow");
    const div = document.createElement("div");
    div.className = "message message-assistant";
    div.id = "loadingMsg";
    div.innerHTML = `
      <span class="msg-label">Sacred Texts</span>
      <div class="loading">
        <div class="loading-dots"><span></span><span></span><span></span></div>
        <span class="loading-text">Consulting the scriptures…</span>
      </div>
    `;
    w.appendChild(div);
    scrollToBottom();
    return div;
  }

  function replaceLoadingWithAnswer(loadingEl, data) {
    // Build source tags
    const sourceTags = (data.sources || []).map(s => {
      const cls = getSourceClass(s.book);
      return `<span class="source-tag ${cls}" title="Page ${s.page}">📖 ${s.book}</span>`;
    }).join("");

    const sourcesHtml = sourceTags ? `
      <div class="sources">
        <div class="sources-label">References</div>
        <div class="source-tags">${sourceTags}</div>
      </div>
    ` : "";

    loadingEl.innerHTML = `
      <span class="msg-label">Sacred Texts</span>
      <div class="msg-bubble">${formatAnswer(data.answer)}</div>
      ${sourcesHtml}
    `;
    scrollToBottom();
  }

  function replaceLoadingWithError(loadingEl, msg) {
    loadingEl.innerHTML = `
      <span class="msg-label">Error</span>
      <div class="error-bubble">⚠️ ${escapeHtml(msg)}</div>
    `;
    scrollToBottom();
  }

  function escapeHtml(str) {
    return str.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  }

  // ── Send question ──────────────────────────────────────────
  async function sendQuestion() {
    if (isLoading) return;
    const input = document.getElementById("questionInput");
    const question = input.value.trim();
    if (!question) return;

    hideWelcome();
    isLoading = true;
    document.getElementById("sendBtn").disabled = true;
    input.value = "";
    input.style.height = "auto";

    appendUserMessage(question);
    const loadingEl = appendLoading();

    try {
      const res = await fetch(`${API_BASE}/ask`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ question }),
      });

      if (!res.ok) {
        const err = await res.json().catch(() => ({ detail: res.statusText }));
        throw new Error(err.detail || "Server error");
      }

      const data = await res.json();
      replaceLoadingWithAnswer(loadingEl, data);
    } catch (err) {
      let msg = err.message;
      if (msg.includes("fetch") || msg.includes("NetworkError") || msg.includes("Failed")) {
        msg = "Cannot reach the server. Make sure `python app.py` is running on localhost:8000.";
      }
      replaceLoadingWithError(loadingEl, msg);
    } finally {
      isLoading = false;
      document.getElementById("sendBtn").disabled = false;
      input.focus();
    }
  }

  function askSuggested(btn) {
    const input = document.getElementById("questionInput");
    input.value = btn.textContent;
    autoResize(input);
    sendQuestion();
  }

  function handleKey(e) {
    if (e.key === "Enter" && !e.shiftKey) {
      e.preventDefault();
      sendQuestion();
    }
  }
</script>
</body>
</html>
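The script above talks to a single endpoint, `POST {API_BASE}/ask`, sending `{"question": ...}` and expecting `{"answer": ..., "sources": [{book, page, snippet}, ...]}` back. A minimal Python sketch of that contract, mirroring the frontend's `getSourceClass()` routing (the names here are illustrative mirrors, not part of the backend code):

```python
import json

def build_ask_payload(question: str) -> str:
    # JSON body the frontend sends to POST /ask
    return json.dumps({"question": question})

def source_class(book: str) -> str:
    # Python mirror of the frontend's getSourceClass() keyword routing
    b = book.lower()
    if "gita" in b: return "source-gita"
    if "quran" in b or "koran" in b: return "source-quran"
    if "bible" in b or "testament" in b: return "source-bible"
    if "granth" in b or "guru" in b: return "source-granth"
    return "source-other"

# A response shaped like the one replaceLoadingWithAnswer() consumes
sample = {"answer": "...", "sources": [{"book": "Quran", "page": 12, "snippet": "..."}]}
tags = [source_class(s["book"]) for s in sample["sources"]]
print(tags)  # ['source-quran']
```

Any book name that matches none of the keywords falls through to `source-other`, which is why newly ingested scriptures still render, just without a dedicated badge colour.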
ingest.py ADDED
@@ -0,0 +1,179 @@
"""
ingest.py – Step 1: Build the vector knowledge base from religious PDFs.

Run this ONCE before starting the app:
    python ingest.py

It will:
  1. Load all PDFs from the ./books/ directory
  2. Split them into overlapping semantic chunks
  3. Embed each chunk using NVIDIA's llama-nemotron embedding model
  4. Persist everything into a local ChromaDB vector store
"""

import os
import sys
from pathlib import Path
from dotenv import load_dotenv

from langchain_community.document_loaders import PyPDFLoader, PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
from langchain_chroma import Chroma

load_dotenv()

# ─── Configuration ────────────────────────────────────────────────────────────

BOOKS_DIR = Path("./books")
CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_db")
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "sacred_texts")
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")

# Mapping of filename keywords → friendly book name stored in metadata
BOOK_NAME_MAP = {
    "gita": "Bhagavad Gita",
    "bhagavad": "Bhagavad Gita",
    "quran": "Quran",
    "koran": "Quran",
    "bible": "Bible",
    "testament": "Bible",
    "granth": "Guru Granth Sahib",
    "guru": "Guru Granth Sahib",
}

# Chunk settings – tuned for religious texts (verses are short)
CHUNK_SIZE = 800      # characters per chunk
CHUNK_OVERLAP = 150   # overlap to preserve verse context across boundaries


# ─── Helpers ──────────────────────────────────────────────────────────────────

def detect_book_name(filename: str) -> str:
    """Infer the book's display name from its filename."""
    name_lower = filename.lower()
    for keyword, book_name in BOOK_NAME_MAP.items():
        if keyword in name_lower:
            return book_name
    # Fallback: use the filename stem, title-cased
    return Path(filename).stem.replace("_", " ").title()


def load_pdf(pdf_path: Path) -> list:
    """
    Load a PDF using PyMuPDF (preferred) or PyPDF as fallback.
    Returns a list of LangChain Document objects.
    """
    try:
        loader = PyMuPDFLoader(str(pdf_path))
        print(f"  📄 Loading with PyMuPDF: {pdf_path.name}")
    except Exception:
        loader = PyPDFLoader(str(pdf_path))
        print(f"  📄 Loading with PyPDF: {pdf_path.name}")

    docs = loader.load()
    print(f"     ✓ {len(docs)} pages loaded")
    return docs


def tag_documents(docs: list, book_name: str, source_file: str) -> list:
    """
    Enrich each document's metadata with:
      - book: display name (e.g. "Bhagavad Gita")
      - source_file: original filename
    """
    for doc in docs:
        doc.metadata["book"] = book_name
        doc.metadata["source_file"] = source_file
        # Keep the page number if already present from the loader
        if "page" not in doc.metadata:
            doc.metadata["page"] = 0
    return docs


# ─── Main Ingestion ───────────────────────────────────────────────────────────

def ingest():
    if not NVIDIA_API_KEY:
        print("❌ NVIDIA_API_KEY not set. Add it to your .env file.")
        sys.exit(1)

    if not BOOKS_DIR.exists():
        print(f"❌ Books directory not found: {BOOKS_DIR.resolve()}")
        print("   Create a ./books/ folder and add your PDFs there.")
        sys.exit(1)

    pdf_files = list(BOOKS_DIR.glob("*.pdf"))
    if not pdf_files:
        print(f"❌ No PDF files found in {BOOKS_DIR.resolve()}")
        sys.exit(1)

    print(f"\n🕉️ Sacred Texts RAG – Ingestion Pipeline")
    print(f"{'─' * 50}")
    print(f"📁 Books directory : {BOOKS_DIR.resolve()}")
    print(f"💾 ChromaDB path   : {Path(CHROMA_DB_PATH).resolve()}")
    print(f"📄 PDFs found      : {len(pdf_files)}")
    print(f"{'─' * 50}\n")

    # ── Step 1: Load all PDFs ────────────────────────────────────────────────
    all_docs = []
    for pdf_path in pdf_files:
        book_name = detect_book_name(pdf_path.name)
        print(f"📖 {book_name}")
        raw_docs = load_pdf(pdf_path)
        tagged_docs = tag_documents(raw_docs, book_name, pdf_path.name)
        all_docs.extend(tagged_docs)
        print(f"     ✅ Tagged as '{book_name}'\n")

    print(f"📊 Total pages loaded: {len(all_docs)}")

    # ── Step 2: Split into chunks ────────────────────────────────────────────
    print(f"\n✂️ Splitting into chunks (size={CHUNK_SIZE}, overlap={CHUNK_OVERLAP})...")
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
        separators=["\n\n", "\n", ". ", " ", ""],  # Respect paragraph/verse boundaries
    )
    chunks = splitter.split_documents(all_docs)
    print(f"   ✓ {len(chunks)} chunks created")

    # ── Step 3: Embed & store ────────────────────────────────────────────────
    print(f"\n🔢 Initialising NVIDIA embedding model (llama-nemotron-embed-vl-1b-v2)...")
    embeddings = NVIDIAEmbeddings(
        model="nvidia/llama-nemotron-embed-vl-1b-v2",
        api_key=NVIDIA_API_KEY,
        truncate="NONE",
    )

    print(f"💾 Building ChromaDB vector store – this may take a few minutes...")
    print(f"   (Embedding {len(chunks)} chunks...)\n")

    # Process in batches to avoid rate limits
    BATCH_SIZE = 100
    vector_store = None

    for i in range(0, len(chunks), BATCH_SIZE):
        batch = chunks[i : i + BATCH_SIZE]
        batch_num = i // BATCH_SIZE + 1
        total_batches = (len(chunks) + BATCH_SIZE - 1) // BATCH_SIZE
        print(f"   Batch {batch_num}/{total_batches}: embedding {len(batch)} chunks...")

        if vector_store is None:
            vector_store = Chroma.from_documents(
                documents=batch,
                embedding=embeddings,
                persist_directory=CHROMA_DB_PATH,
                collection_name=COLLECTION_NAME,
            )
        else:
            vector_store.add_documents(batch)

    print(f"\n{'─' * 50}")
    print(f"✅ Ingestion complete!")
    print(f"   📦 {len(chunks)} chunks stored in ChromaDB")
    print(f"   📁 Location: {Path(CHROMA_DB_PATH).resolve()}")
    print(f"\n👉 Now run: python app.py")
    print(f"{'─' * 50}\n")


if __name__ == "__main__":
    ingest()
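The filename → metadata mapping above is the only thing that ties a PDF to its display name, so it is worth sanity-checking against the actual files shipped in ./books/. A standalone sketch of the same keyword lookup (no LangChain required):

```python
from pathlib import Path

BOOK_NAME_MAP = {
    "gita": "Bhagavad Gita", "bhagavad": "Bhagavad Gita",
    "quran": "Quran", "koran": "Quran",
    "bible": "Bible", "testament": "Bible",
    "granth": "Guru Granth Sahib", "guru": "Guru Granth Sahib",
}

def detect_book_name(filename: str) -> str:
    # First keyword found in the lowercased filename wins
    name_lower = filename.lower()
    for keyword, book_name in BOOK_NAME_MAP.items():
        if keyword in name_lower:
            return book_name
    # Fallback: title-cased filename stem
    return Path(filename).stem.replace("_", " ").title()

# The four PDFs in ./books/ each resolve to a known book:
for f in ["Bhagavad-gita-As-It-Is.pdf", "A-Quran-Translation.pdf",
          "CSB_Pew_Bible_2nd_Printing.pdf", "Siri Guru Granth.pdf"]:
    print(f, "→", detect_book_name(f))
```

A file matching none of the keywords gets its stem as the display name, which then shows up as a new (unfiltered) book in rag_chain.py unless it is also added to KNOWN_BOOKS there.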
rag_chain.py ADDED
@@ -0,0 +1,240 @@
"""
rag_chain.py – Core RAG chain using LangChain + Gemini.

KEY FIX: Uses per-book retrieval (guaranteed slots per scripture) instead of
a single similarity search – so no book gets starved from the context window
when the query is semantically closer to another book's language.

This module exposes a single function:
    answer = query_sacred_texts(user_question)

Returns a dict with:
    {
      "answer": "...",
      "sources": [
         {"book": "Bhagavad Gita", "page": 42, "snippet": "..."},
         ...
      ]
    }
"""

import os
from dotenv import load_dotenv
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_db")
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "sacred_texts")

# Chunks retrieved PER BOOK – guarantees every scripture contributes to the answer
CHUNKS_PER_BOOK = int(os.getenv("CHUNKS_PER_BOOK", "3"))

# All books currently in the knowledge base – add new books here as you ingest them
KNOWN_BOOKS = [
    "Bhagavad Gita",
    "Quran",
    "Bible",
    "Guru Granth Sahib",
]


# ─── System Prompt ────────────────────────────────────────────────────────────

SYSTEM_PROMPT = """You are a scholarly and compassionate guide to sacred scriptures.
You have deep knowledge of the Bhagavad Gita, the Quran, the Bible, and the Guru Granth Sahib.

STRICT RULES you must ALWAYS follow:
1. Answer ONLY using the provided context passages. Do NOT use any external knowledge.
2. If a specific book's passages are provided but not relevant to the question, skip that book.
3. If NONE of the context is relevant, say: "The provided texts do not directly address this question."
4. Always cite which book(s) your answer draws from.
5. When the question asks to COMPARE books (e.g. "what do Quran and Gita say"), you MUST
   address EACH of those books separately, then synthesise the common thread.
6. Be respectful and neutral toward all faiths – treat each text with equal reverence.
7. Do NOT speculate, invent verses, or add information beyond the context.

FORMAT your response as:
- A clear, thoughtful answer (2–4 paragraphs)
- A "📚 Sources" section listing each book referenced with the key insight drawn from it

Context passages from the sacred texts (guaranteed passages from each book):
────────────────────────────────────────
{context}
────────────────────────────────────────
"""

HUMAN_PROMPT = "Question: {question}"


# ─── Embeddings & Vector Store ────────────────────────────────────────────────

def get_embeddings():
    return NVIDIAEmbeddings(
        model="nvidia/llama-nemotron-embed-vl-1b-v2",
        api_key=NVIDIA_API_KEY,
        truncate="NONE",
    )


def get_vector_store(embeddings):
    return Chroma(
        persist_directory=CHROMA_DB_PATH,
        embedding_function=embeddings,
        collection_name=COLLECTION_NAME,
    )


# ─── Per-Book Retrieval ───────────────────────────────────────────────────────

def retrieve_per_book(question: str, vector_store: Chroma) -> list:
    """
    Retrieve CHUNKS_PER_BOOK chunks from EACH known book independently,
    using a metadata filter. This guarantees every scripture is represented
    in the context – no book can be crowded out by higher-scoring chunks
    from another book.
    """
    all_docs = []
    for book in KNOWN_BOOKS:
        try:
            results = vector_store.similarity_search(
                query=question,
                k=CHUNKS_PER_BOOK,
                filter={"book": book},  # ← metadata filter: only this book
            )
            if results:
                print(f"   📖 {book}: {len(results)} chunk(s) retrieved")
            else:
                print(f"   ⚠️ {book}: 0 chunks found (not ingested?)")
            all_docs.extend(results)
        except Exception as e:
            print(f"   ❌ {book}: retrieval error – {e}")

    return all_docs


# ─── Format Retrieved Docs ────────────────────────────────────────────────────

def format_docs(docs: list) -> str:
    """
    Format retrieved documents grouped by book for clarity.
    Each chunk is labelled with book and page number.
    """
    # Group by book to keep context readable
    by_book: dict[str, list] = {}
    for doc in docs:
        book = doc.metadata.get("book", "Unknown")
        by_book.setdefault(book, []).append(doc)

    sections = []
    for book, book_docs in by_book.items():
        header = f"─── {book} ───"
        chunks = []
        for i, doc in enumerate(book_docs, 1):
            page = doc.metadata.get("page", "?")
            chunks.append(f"  [{i}] (Page {page}): {doc.page_content.strip()}")
        sections.append(header + "\n" + "\n\n".join(chunks))

    return "\n\n".join(sections)


# ─── Build the RAG Chain ──────────────────────────────────────────────────────

def build_chain():
    """Build and return the LLM chain and vector store."""
    embeddings = get_embeddings()
    vector_store = get_vector_store(embeddings)

    llm = ChatGoogleGenerativeAI(
        model="gemini-2.5-flash-lite",
        google_api_key=GEMINI_API_KEY,
        temperature=0.2,
        max_output_tokens=1500,
    )

    prompt = ChatPromptTemplate.from_messages([
        ("system", SYSTEM_PROMPT),
        ("human", HUMAN_PROMPT),
    ])

    # Chain: prompt → LLM → string output
    # (retrieval is handled manually in query_sacred_texts for per-book control)
    llm_chain = prompt | llm | StrOutputParser()

    return llm_chain, vector_store


# ─── Public API ───────────────────────────────────────────────────────────────

_llm_chain = None
_vector_store = None


def query_sacred_texts(question: str) -> dict:
    """
    Query the sacred texts knowledge base with guaranteed per-book retrieval.

    Args:
        question: The user's spiritual/philosophical question.

    Returns:
        {
          "answer": str,
          "sources": list[dict]  # [{book, page, snippet}, ...]
        }
    """
    global _llm_chain, _vector_store

    if _llm_chain is None:
        print("🔧 Initialising RAG chain (first call)...")
        _llm_chain, _vector_store = build_chain()

    # Step 1: Retrieve per-book (guaranteed slots for every scripture)
    print(f"\n🔍 Retrieving {CHUNKS_PER_BOOK} chunks per book for: '{question}'")
    source_docs = retrieve_per_book(question, _vector_store)

    if not source_docs:
        return {
            "answer": "No content found in the knowledge base. Please run ingest.py first.",
            "sources": [],
        }

    # Step 2: Format context grouped by book
    context = format_docs(source_docs)

    # Step 3: Generate answer
    answer = _llm_chain.invoke({"context": context, "question": question})

    # Step 4: Build deduplicated source list for the UI
    seen_books = set()
    sources = []
    for doc in source_docs:
        book = doc.metadata.get("book", "Unknown")
        page = doc.metadata.get("page", "?")
        snippet = doc.page_content[:200].strip() + "..."
        if book not in seen_books:
            seen_books.add(book)
            sources.append({"book": book, "page": page, "snippet": snippet})

    return {
        "answer": answer,
        "sources": sources,
    }


# ─── Quick CLI Test ───────────────────────────────────────────────────────────

if __name__ == "__main__":
    test_q = "In what aspects do the Quran and Gita teach the same thing?"
    print(f"\n🙏 Test query: {test_q}\n")
    result = query_sacred_texts(test_q)
    print("📜 Answer:\n")
    print(result["answer"])
    print("\n📚 Sources retrieved:")
    for s in result["sources"]:
        print(f"  - {s['book']} (page {s['page']})")
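format_docs() is pure dictionary-shuffling, so its grouping behaviour can be checked without any vector store or API keys. A minimal sketch using SimpleNamespace stand-ins in place of LangChain Documents (same metadata keys, hypothetical test data):

```python
from types import SimpleNamespace

def format_docs(docs):
    # Same grouping as rag_chain.format_docs: bucket chunks by "book" metadata
    by_book = {}
    for doc in docs:
        by_book.setdefault(doc.metadata.get("book", "Unknown"), []).append(doc)
    sections = []
    for book, book_docs in by_book.items():
        header = f"─── {book} ───"
        chunks = [
            f"  [{i}] (Page {d.metadata.get('page', '?')}): {d.page_content.strip()}"
            for i, d in enumerate(book_docs, 1)
        ]
        sections.append(header + "\n" + "\n\n".join(chunks))
    return "\n\n".join(sections)

docs = [
    SimpleNamespace(metadata={"book": "Quran", "page": 3}, page_content="verse A"),
    SimpleNamespace(metadata={"book": "Bible", "page": 7}, page_content="verse B"),
    SimpleNamespace(metadata={"book": "Quran", "page": 9}, page_content="verse C"),
]
print(format_docs(docs))
# Both Quran chunks land under one "─── Quran ───" header, numbered [1] and [2]
```

Because dict insertion order is preserved, the books appear in the context in the order their first chunk was retrieved, which with per-book retrieval follows KNOWN_BOOKS.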
requirements.txt ADDED
@@ -0,0 +1,26 @@
# Core LangChain
langchain
langchain-google-genai
langchain-community
langchain-chroma
langchain-nvidia-ai-endpoints
langchain-text-splitters

# Vector Store
chromadb

# PDF Loading
pypdf
pymupdf  # Better PDF parsing (optional but recommended)

# Google Gemini
google-generativeai

# API Server
fastapi
uvicorn[standard]
python-multipart

# Utilities
python-dotenv
pydantic