Shouvik599 committed
Commit 22fd41f · 0 parent(s)

Initial clean commit
.dockerignore ADDED
@@ -0,0 +1,9 @@
+ .git
+ .gitignore
+ .venv
+ __pycache__
+ chroma_db
+ *.md
+ env.example
+ .dockerignore
+ Dockerfile
.gitattributes ADDED
@@ -0,0 +1 @@
+ books/*.pdf filter=lfs diff=lfs merge=lfs -text
.github/workflows/main.yml ADDED
@@ -0,0 +1,21 @@
+ name: Sync to Hugging Face hub
+ on:
+   push:
+     branches: [main]
+   # Allows you to run this workflow manually from the Actions tab
+   workflow_dispatch:
+
+ jobs:
+   sync-to-hub:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v3
+         with:
+           fetch-depth: 0
+           lfs: true
+
+       - name: Push to hub
+         env:
+           HF_TOKEN: ${{ secrets.HF_TOKEN }}
+         # Replace <USERNAME> and <SPACE_NAME> with your actual HF details
+         run: git push --force https://Shouvik99:$HF_TOKEN@huggingface.co/spaces/Shouvik99/LifeGuide main
.gitignore ADDED
Binary file (2.08 kB)
Dockerfile ADDED
@@ -0,0 +1,18 @@
+ # Use an official Python runtime as a parent image
+ FROM python:3.11-slim
+
+ # Set the working directory in the container
+ WORKDIR /app
+
+ # Copy the requirements file and install dependencies
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the application code
+ COPY . .
+
+ # Run the data ingestion script
+ RUN python ingest.py
+
+ # Run the application in Hugging Face Space
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,116 @@
+ ---
+ title: Sacred Texts RAG
+ emoji: 🕊️
+ colorFrom: gold
+ colorTo: white
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # 🕊️ Sacred Texts RAG — Multi-Religion Knowledge Base
+
+ A Retrieval-Augmented Generation (RAG) application that answers spiritual queries using the Bhagavad Gita, the Quran, the Bible, and the Guru Granth Sahib as its sole knowledge sources.
+
+ ---
+
+ ## 📁 Project Structure
+
+ ```
+ sacred-texts-rag/
+ ├── README.md
+ ├── requirements.txt
+ ├── env.example
+ ├── ingest.py        # Step 1: Load PDFs → chunk → embed → store
+ ├── rag_chain.py     # Core RAG chain logic
+ ├── app.py           # FastAPI backend server
+ └── frontend/
+     └── index.html   # Chat UI (open in browser)
+ ```
+
+ ---
+
+ ## ⚙️ Setup Instructions
+
+ ### 1. Install Dependencies
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### 2. Configure Environment
+ ```bash
+ cp env.example .env
+ # Edit .env and add your GEMINI_API_KEY
+ ```
+
+ ### 3. Add Your PDF Books
+ Place your PDF files in a `books/` folder:
+ ```
+ books/
+ ├── bhagavad_gita.pdf
+ ├── quran.pdf
+ ├── bible.pdf
+ └── guru_granth_sahib.pdf
+ ```
+
+ ### 4. Ingest the Books (Run Once)
+ ```bash
+ python ingest.py
+ ```
+ This will:
+ - Load and parse all PDFs
+ - Split them into semantic chunks
+ - Create embeddings using NVIDIA's `llama-nemotron-embed-vl-1b-v2` model
+ - Store them in a local ChromaDB vector store (`./chroma_db/`)
+
+ ### 5. Start the Backend
+ ```bash
+ python app.py
+ ```
+ The server runs at `http://localhost:8000`.
+
+ ### 6. Open the Frontend
+ Open `frontend/index.html` in your browser — no server is needed for the UI.
+
+ ---
+
+ ## 🔑 Environment Variables
+
+ | Variable | Description |
+ |---|---|
+ | `GEMINI_API_KEY` | Your Google Gemini API key |
+ | `NVIDIA_API_KEY` | Your NVIDIA API key |
+ | `CHROMA_DB_PATH` | Path to ChromaDB storage (default: `./chroma_db`) |
+ | `CHUNKS_PER_BOOK` | Number of chunks to retrieve per book per query (default: `3`) |
+
+ ---
+
+ ## 🧠 How It Works
+
+ ```
+ User Query
+     │
+     ▼
+ [Embedding Model]  ←── NVIDIA llama-nemotron-embed-vl-1b-v2
+     │
+     ▼
+ [ChromaDB Vector Store]  ←── Semantic similarity search
+     │   (retrieves top-K chunks from the Gita, Quran, Bible, and Guru Granth Sahib)
+     │
+     ▼
+ [Prompt with Context]
+     │
+     ▼
+ [Gemini 2.5 Flash Lite]  ←── Answer grounded ONLY in retrieved texts
+     │
+     ▼
+ Response with source citations (book + chapter/verse)
+ ```
+
+ ---
+
+ ## 📝 Notes
+
+ - The LLM is instructed **never** to answer from outside the provided texts
+ - Each response includes **source citations** (which book the answer came from)
+ - Responses synthesize wisdom **across all books** when relevant
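Step 4's split-and-overlap behaviour can be illustrated with a minimal sketch. This is not the repo's actual splitter (`ingest.py` uses LangChain's `RecursiveCharacterTextSplitter`, which prefers paragraph and sentence boundaries); the fixed-window version below only shows how chunk size and overlap interact, and the sizes are illustrative:

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    """Fixed-size character windows that share `overlap` characters with their neighbour."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Because neighbouring chunks share `overlap` characters, a verse cut at one chunk boundary still appears whole inside an adjacent chunk, which is what makes boundary-straddling passages retrievable.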
app.py ADDED
@@ -0,0 +1,133 @@
+ """
+ app.py — FastAPI backend server for the Sacred Texts RAG application.
+
+ Endpoints:
+     POST /ask    — Ask a question, get an answer with sources
+     GET  /health — Health check
+     GET  /books  — List books currently in the knowledge base
+
+ Run with:
+     python app.py
+ """
+
+ import os
+ from fastapi import FastAPI, HTTPException
+ from fastapi.middleware.cors import CORSMiddleware
+ from pydantic import BaseModel, Field
+ from dotenv import load_dotenv
+
+ from rag_chain import query_sacred_texts, get_embeddings, get_vector_store
+
+ load_dotenv()
+
+ # ─── App Setup ────────────────────────────────────────────────────────────────
+
+ app = FastAPI(
+     title="Sacred Texts RAG API",
+     description="Ask questions answered exclusively from the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib",
+     version="1.0.0",
+ )
+
+ # Allow requests from the local frontend (index.html opened as file://)
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],  # Restrict in production
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+
+ # ─── Request / Response Models ────────────────────────────────────────────────
+
+ class AskRequest(BaseModel):
+     question: str = Field(..., min_length=3, max_length=1000,
+                           example="What do the scriptures say about compassion?")
+
+ class Source(BaseModel):
+     book: str
+     page: int | str
+     snippet: str
+
+ class AskResponse(BaseModel):
+     question: str
+     answer: str
+     sources: list[Source]
+
+ class HealthResponse(BaseModel):
+     status: str
+     message: str
+
+ class BooksResponse(BaseModel):
+     books: list[str]
+     total_chunks: int
+
+
+ # ─── Routes ───────────────────────────────────────────────────────────────────
+
+ @app.get("/health", response_model=HealthResponse, tags=["System"])
+ def health_check():
+     """Check that the API is running."""
+     return {"status": "ok", "message": "Sacred Texts RAG is running 🕊️"}
+
+
+ @app.get("/books", response_model=BooksResponse, tags=["Knowledge Base"])
+ def list_books():
+     """List all books currently indexed in the knowledge base."""
+     try:
+         embeddings = get_embeddings()
+         vector_store = get_vector_store(embeddings)
+         collection = vector_store._collection  # Chroma's underlying collection
+         results = collection.get(include=["metadatas"])
+         metadatas = results.get("metadatas", [])
+
+         books = sorted(set(
+             m.get("book", "Unknown")
+             for m in metadatas
+             if m  # guard against None
+         ))
+         return {"books": books, "total_chunks": len(metadatas)}
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=f"Could not read knowledge base: {e}")
+
+
+ @app.post("/ask", response_model=AskResponse, tags=["Query"])
+ def ask(request: AskRequest):
+     """
+     Ask a spiritual or philosophical question.
+     The answer is grounded strictly in the sacred texts.
+     """
+     if not request.question.strip():
+         raise HTTPException(status_code=400, detail="Question cannot be empty.")
+
+     try:
+         result = query_sacred_texts(request.question)
+         return AskResponse(
+             question=request.question,
+             answer=result["answer"],
+             sources=[Source(**s) for s in result["sources"]],
+         )
+     except FileNotFoundError:
+         raise HTTPException(
+             status_code=503,
+             detail="Knowledge base not found. Run `python ingest.py` first.",
+         )
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+
+ # ─── Entry Point ──────────────────────────────────────────────────────────────
+
+ if __name__ == "__main__":
+     import uvicorn
+
+     host = os.getenv("HOST", "0.0.0.0")
+     port = int(os.getenv("PORT", "8000"))
+
+     print("\n🕊️ Sacred Texts RAG — API Server")
+     print("─" * 40)
+     print(f"🌐 Running at : http://localhost:{port}")
+     print(f"📖 Docs at    : http://localhost:{port}/docs")
+     print("─" * 40 + "\n")
+
+     uvicorn.run("app:app", host=host, port=port, reload=True)
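As a sketch of how a client would talk to the `/ask` route in `app.py`, the helpers below build the request body and summarise a response. The field names mirror `AskRequest` and `AskResponse`; the helper functions themselves are hypothetical illustrations, not part of the repo:

```python
import json

def build_ask_payload(question: str) -> str:
    """Serialise a question for POST /ask, mirroring AskRequest's length bounds."""
    question = question.strip()
    if not (3 <= len(question) <= 1000):
        raise ValueError("question must be 3-1000 characters")
    return json.dumps({"question": question})

def cited_books(response: dict) -> list[str]:
    """Unique book names cited in an AskResponse-shaped dict."""
    return sorted({s["book"] for s in response.get("sources", [])})
```

Deduplicating the `sources` list by book is exactly what the frontend's source tags do visually; a duplicate book in the response only means two chunks came from it.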
books/A-Quran-Translation.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3fa3d5c08e744166f064cbb63663737aa40025c2f582ee37aa3ceffe282aebcd
+ size 3894852
books/Bhagavad-gita-As-It-Is.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ff112b0b056d303b792f6f2e68cbd73a89adf612fa9113f932446cdea7741583
+ size 66135830
books/CSB_Pew_Bible_2nd_Printing.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cb7a72772507690b41ac1b08a36ea355422e1b6561e0438bfeeef73504c53ebd
+ size 16634733
books/Siri Guru Granth.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b03376ce26b6fc709500dec2a1a4a1bbfdde739716159deeb790892958c97cb6
+ size 7066831
env.example ADDED
@@ -0,0 +1,18 @@
+ # ─── Google Gemini (LLM for answer generation) ────────────────────────────────
+ GEMINI_API_KEY=your_gemini_api_key_here
+
+ # ─── NVIDIA (Embeddings) ──────────────────────────────────────────────────────
+ NVIDIA_API_KEY=nvapi-your_nvidia_api_key_here
+
+ # ─── Vector Store ─────────────────────────────────────────────────────────────
+ CHROMA_DB_PATH=./chroma_db
+ COLLECTION_NAME=sacred_texts
+
+ # ─── Retrieval Settings ───────────────────────────────────────────────────────
+ # Chunks retrieved PER BOOK — every scripture gets this many slots guaranteed
+ # Total context = CHUNKS_PER_BOOK x number of books (e.g. 3 x 4 = 12 chunks)
+ CHUNKS_PER_BOOK=3
+
+ # ─── Server ───────────────────────────────────────────────────────────────────
+ HOST=0.0.0.0
+ PORT=8000
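The per-book guarantee described in `env.example` (`CHUNKS_PER_BOOK` slots for every scripture, rather than a single global top-K) can be sketched as a round-robin merge of per-book retrieval results. The function is illustrative only, not the repo's `rag_chain.py` implementation:

```python
def interleave_per_book(per_book_chunks: dict[str, list[str]],
                        chunks_per_book: int = 3) -> list[tuple[str, str]]:
    """Merge per-book results so every book fills one slot before any gets a second."""
    context = []
    for rank in range(chunks_per_book):
        for book, chunks in per_book_chunks.items():
            if rank < len(chunks):  # a book may return fewer than its quota
                context.append((book, chunks[rank]))
    return context
```

With four books and the default of 3, the prompt context is 3 x 4 = 12 chunks, matching the arithmetic in the comment above.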
frontend/index.html ADDED
@@ -0,0 +1,626 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+   <meta charset="UTF-8" />
+   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+   <title>Sacred Texts — Divine Knowledge</title>
+   <link rel="preconnect" href="https://fonts.googleapis.com" />
+   <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
+   <link href="https://fonts.googleapis.com/css2?family=Cinzel+Decorative:wght@400;700&family=Cormorant+Garamond:ital,wght@0,300;0,400;0,600;1,300;1,400&family=IM+Fell+English:ital@0;1&display=swap" rel="stylesheet" />
+
+   <style>
+     /* ── Reset & Base ─────────────────────────────────────────── */
+     *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
+
+     :root {
+       --bg: #0d0b07;
+       --surface: #16130d;
+       --surface-2: #1e1a11;
+       --border: #3a2e1a;
+       --gold: #c9993a;
+       --gold-light: #e8c170;
+       --gold-pale: #f5e4b0;
+       --cream: #f0e6cc;
+       --muted: #7a6a4a;
+       --gita: #e07b3b;    /* saffron */
+       --quran: #3bba85;   /* green */
+       --bible: #5b8ce0;   /* blue */
+       --granth: #b07ce0;  /* violet — Sikh royal purple */
+     }
+
+     html, body {
+       height: 100%;
+       background: var(--bg);
+       color: var(--cream);
+       font-family: 'Cormorant Garamond', Georgia, serif;
+       font-size: 18px;
+       line-height: 1.7;
+       overflow: hidden;
+     }
+
+     /* ── Background texture ───────────────────────────────────── */
+     body::before {
+       content: '';
+       position: fixed; inset: 0;
+       background:
+         radial-gradient(ellipse 80% 60% at 20% 10%, rgba(201,153,58,.07) 0%, transparent 60%),
+         radial-gradient(ellipse 60% 80% at 80% 90%, rgba(91,140,224,.05) 0%, transparent 60%),
+         radial-gradient(ellipse 50% 50% at 50% 50%, rgba(176,124,224,.04) 0%, transparent 60%),
+         url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='400' height='400'%3E%3Cfilter id='n'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='0.75' numOctaves='4' stitchTiles='stitch'/%3E%3CfeColorMatrix type='saturate' values='0'/%3E%3C/filter%3E%3Crect width='400' height='400' filter='url(%23n)' opacity='0.04'/%3E%3C/svg%3E");
+       pointer-events: none;
+       z-index: 0;
+     }
+
+     /* ── Layout ───────────────────────────────────────────────── */
+     .app {
+       position: relative;
+       z-index: 1;
+       display: grid;
+       grid-template-rows: auto 1fr auto;
+       height: 100vh;
+       max-width: 860px;
+       margin: 0 auto;
+       padding: 0 16px;
+     }
+
+     /* ── Header ───────────────────────────────────────────────── */
+     header {
+       padding: 28px 0 18px;
+       text-align: center;
+       border-bottom: 1px solid var(--border);
+     }
+
+     .mandala {
+       font-size: 2rem;
+       letter-spacing: .5rem;
+       color: var(--gold);
+       opacity: .6;
+       margin-bottom: 8px;
+       animation: spin 60s linear infinite;
+       display: inline-block;
+     }
+     @keyframes spin { to { transform: rotate(360deg); } }
+
+     h1 {
+       font-family: 'Cinzel Decorative', serif;
+       font-size: clamp(1.2rem, 3vw, 1.9rem);
+       font-weight: 400;
+       color: var(--gold-pale);
+       letter-spacing: .12em;
+       text-shadow: 0 0 40px rgba(201,153,58,.3);
+     }
+
+     .subtitle {
+       font-family: 'IM Fell English', serif;
+       font-style: italic;
+       font-size: .95rem;
+       color: var(--muted);
+       margin-top: 4px;
+     }
+
+     .badges {
+       display: flex;
+       justify-content: center;
+       gap: 12px;
+       margin-top: 12px;
+       flex-wrap: wrap;
+     }
+
+     .badge {
+       font-size: .72rem;
+       letter-spacing: .1em;
+       text-transform: uppercase;
+       padding: 3px 10px;
+       border-radius: 20px;
+       border: 1px solid;
+       font-family: 'Cormorant Garamond', serif;
+       font-weight: 600;
+     }
+     .badge-gita { color: var(--gita); border-color: var(--gita); background: rgba(224,123,59,.1); }
+     .badge-quran { color: var(--quran); border-color: var(--quran); background: rgba(59,186,133,.1); }
+     .badge-bible { color: var(--bible); border-color: var(--bible); background: rgba(91,140,224,.1); }
+     .badge-granth { color: var(--granth); border-color: var(--granth); background: rgba(176,124,224,.1); }
+
+     /* ── Chat Window ──────────────────────────────────────────── */
+     .chat-window {
+       overflow-y: auto;
+       padding: 28px 0;
+       display: flex;
+       flex-direction: column;
+       gap: 24px;
+       scrollbar-width: thin;
+       scrollbar-color: var(--border) transparent;
+     }
+     .chat-window::-webkit-scrollbar { width: 4px; }
+     .chat-window::-webkit-scrollbar-thumb { background: var(--border); border-radius: 4px; }
+
+     /* ── Welcome State ────────────────────────────────────────── */
+     .welcome {
+       text-align: center;
+       margin: auto;
+       padding: 20px;
+       max-width: 500px;
+     }
+
+     .welcome-icon {
+       font-size: 3.5rem;
+       margin-bottom: 16px;
+       filter: drop-shadow(0 0 20px rgba(201,153,58,.4));
+     }
+
+     .welcome h2 {
+       font-family: 'IM Fell English', serif;
+       font-style: italic;
+       font-size: 1.5rem;
+       color: var(--gold-light);
+       margin-bottom: 10px;
+     }
+
+     .welcome p {
+       font-size: .95rem;
+       color: var(--muted);
+       line-height: 1.8;
+     }
+
+     .suggested-queries {
+       margin-top: 24px;
+       display: flex;
+       flex-direction: column;
+       gap: 8px;
+     }
+
+     .suggested-queries button {
+       background: var(--surface);
+       border: 1px solid var(--border);
+       color: var(--cream);
+       padding: 10px 16px;
+       border-radius: 8px;
+       font-family: 'Cormorant Garamond', serif;
+       font-size: .95rem;
+       font-style: italic;
+       cursor: pointer;
+       transition: all .2s;
+       text-align: left;
+     }
+     .suggested-queries button:hover {
+       border-color: var(--gold);
+       color: var(--gold-pale);
+       background: var(--surface-2);
+     }
+
+     /* ── Messages ─────────────────────────────────────────────── */
+     .message {
+       display: flex;
+       flex-direction: column;
+       gap: 8px;
+       animation: fadeUp .4s ease both;
+     }
+     @keyframes fadeUp {
+       from { opacity: 0; transform: translateY(12px); }
+       to { opacity: 1; transform: translateY(0); }
+     }
+
+     .message-user {
+       align-items: flex-end;
+     }
+     .message-assistant {
+       align-items: flex-start;
+     }
+
+     .msg-label {
+       font-size: .7rem;
+       letter-spacing: .15em;
+       text-transform: uppercase;
+       color: var(--muted);
+       font-weight: 600;
+       padding: 0 4px;
+     }
+
+     .msg-bubble {
+       max-width: 92%;
+       padding: 16px 20px;
+       border-radius: 12px;
+       line-height: 1.75;
+     }
+
+     .message-user .msg-bubble {
+       background: var(--surface-2);
+       border: 1px solid var(--border);
+       color: var(--cream);
+       font-style: italic;
+       font-size: 1rem;
+       border-bottom-right-radius: 4px;
+     }
+
+     .message-assistant .msg-bubble {
+       background: linear-gradient(135deg, var(--surface) 0%, rgba(30,26,17,.95) 100%);
+       border: 1px solid rgba(201,153,58,.2);
+       color: var(--cream);
+       font-size: 1rem;
+       border-bottom-left-radius: 4px;
+       box-shadow: 0 4px 24px rgba(0,0,0,.4), inset 0 1px 0 rgba(201,153,58,.1);
+     }
+
+     .msg-bubble p { margin-bottom: 1em; }
+     .msg-bubble p:last-child { margin-bottom: 0; }
+     .msg-bubble strong { color: var(--gold-light); font-weight: 600; }
+
+     /* ── Sources Panel ────────────────────────────────────────── */
+     .sources {
+       max-width: 92%;
+       margin-top: 4px;
+     }
+
+     .sources-label {
+       font-size: .72rem;
+       letter-spacing: .12em;
+       text-transform: uppercase;
+       color: var(--muted);
+       margin-bottom: 6px;
+       display: flex;
+       align-items: center;
+       gap: 6px;
+     }
+     .sources-label::before, .sources-label::after {
+       content: '';
+       flex: 1;
+       height: 1px;
+       background: var(--border);
+     }
+     .sources-label::before { max-width: 20px; }
+
+     .source-tags {
+       display: flex;
+       flex-wrap: wrap;
+       gap: 6px;
+     }
+
+     .source-tag {
+       font-size: .78rem;
+       padding: 4px 10px;
+       border-radius: 6px;
+       border: 1px solid;
+       font-family: 'Cormorant Garamond', serif;
+       cursor: default;
+       transition: all .2s;
+     }
+     .source-tag:hover { transform: translateY(-1px); filter: brightness(1.2); }
+     .source-gita { color: var(--gita); border-color: rgba(224,123,59,.4); background: rgba(224,123,59,.08); }
+     .source-quran { color: var(--quran); border-color: rgba(59,186,133,.4); background: rgba(59,186,133,.08); }
+     .source-bible { color: var(--bible); border-color: rgba(91,140,224,.4); background: rgba(91,140,224,.08); }
+     .source-granth { color: var(--granth); border-color: rgba(176,124,224,.4); background: rgba(176,124,224,.08); }
+     .source-other { color: var(--gold-light); border-color: rgba(201,153,58,.4); background: rgba(201,153,58,.08); }
+
+     /* ── Loading ──────────────────────────────────────────────── */
+     .loading {
+       display: flex;
+       align-items: center;
+       gap: 12px;
+       padding: 14px 18px;
+       border: 1px solid rgba(201,153,58,.15);
+       border-radius: 12px;
+       background: var(--surface);
+       width: fit-content;
+       max-width: 280px;
+     }
+
+     .loading-dots {
+       display: flex;
+       gap: 5px;
+     }
+     .loading-dots span {
+       width: 6px; height: 6px;
+       border-radius: 50%;
+       background: var(--gold);
+       animation: dot-pulse 1.4s ease-in-out infinite;
+     }
+     .loading-dots span:nth-child(2) { animation-delay: .2s; }
+     .loading-dots span:nth-child(3) { animation-delay: .4s; }
+     @keyframes dot-pulse {
+       0%, 80%, 100% { opacity: .2; transform: scale(.8); }
+       40% { opacity: 1; transform: scale(1.1); }
+     }
+
+     .loading-text {
+       font-size: .85rem;
+       font-style: italic;
+       color: var(--muted);
+     }
+
+     /* ── Error ────────────────────────────────────────────────── */
+     .error-bubble {
+       background: rgba(180, 60, 60, .1);
+       border: 1px solid rgba(180, 60, 60, .3);
+       color: #e08080;
+       padding: 12px 16px;
+       border-radius: 10px;
+       font-size: .9rem;
+       max-width: 92%;
+     }
+
+     /* ── Input Area ───────────────────────────────────────────── */
+     .input-area {
+       padding: 16px 0 24px;
+       border-top: 1px solid var(--border);
+     }
+
+     .input-row {
+       display: flex;
+       gap: 10px;
+       align-items: flex-end;
+     }
+
+     textarea {
+       flex: 1;
+       background: var(--surface);
+       border: 1px solid var(--border);
+       color: var(--cream);
+       padding: 14px 16px;
+       border-radius: 12px;
+       font-family: 'Cormorant Garamond', serif;
+       font-size: 1rem;
+       line-height: 1.6;
+       resize: none;
+       min-height: 52px;
+       max-height: 140px;
+       outline: none;
+       transition: border-color .2s, box-shadow .2s;
+     }
+     textarea::placeholder { color: var(--muted); font-style: italic; }
+     textarea:focus {
+       border-color: rgba(201,153,58,.5);
+       box-shadow: 0 0 0 3px rgba(201,153,58,.08);
+     }
+
+     .send-btn {
+       width: 52px; height: 52px;
+       border-radius: 12px;
+       border: 1px solid rgba(201,153,58,.4);
+       background: linear-gradient(135deg, rgba(201,153,58,.2), rgba(201,153,58,.05));
+       color: var(--gold);
+       font-size: 1.3rem;
+       cursor: pointer;
+       transition: all .2s;
+       display: flex;
+       align-items: center;
+       justify-content: center;
+       flex-shrink: 0;
+     }
+     .send-btn:hover:not(:disabled) {
+       background: linear-gradient(135deg, rgba(201,153,58,.35), rgba(201,153,58,.15));
+       border-color: var(--gold);
+       transform: translateY(-1px);
+       box-shadow: 0 4px 16px rgba(201,153,58,.2);
+     }
+     .send-btn:disabled { opacity: .3; cursor: not-allowed; transform: none; }
+
+     .input-hint {
+       font-size: .72rem;
+       color: var(--muted);
+       margin-top: 8px;
+       text-align: center;
+       font-style: italic;
+     }
+
+     /* ── Divider line ─────────────────────────────────────────── */
+     .ornament {
+       text-align: center;
+       color: var(--border);
+       font-size: .8rem;
+       letter-spacing: .4em;
+       margin: 4px 0;
+     }
+   </style>
+ </head>
+ <body>
+   <div class="app">
+
+     <!-- Header -->
+     <header>
+       <div class="mandala">✦</div>
+       <h1>Life Guide</h1>
+       <p class="subtitle">Wisdom from the Bhagavad Gita, Quran, Bible &amp; Guru Granth Sahib</p>
+       <div class="badges">
+         <span class="badge badge-gita">Bhagavad Gita</span>
+         <span class="badge badge-quran">Quran</span>
+         <span class="badge badge-bible">Bible</span>
+         <span class="badge badge-granth">Guru Granth Sahib</span>
+       </div>
+     </header>
+
+     <!-- Chat Window -->
+     <div class="chat-window" id="chatWindow">
+       <div class="welcome" id="welcomePane">
+         <div class="welcome-icon">🕊️</div>
+         <h2>"Seek, and it shall be given unto you"</h2>
+         <p>Ask any spiritual or philosophical question. Answers are drawn exclusively from the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib.</p>
+         <div class="suggested-queries">
+           <button onclick="askSuggested(this)">What do the scriptures say about forgiveness?</button>
+           <button onclick="askSuggested(this)">How should one face fear and death?</button>
+           <button onclick="askSuggested(this)">What is the purpose of prayer and worship?</button>
+           <button onclick="askSuggested(this)">What is the nature of the soul according to each religion?</button>
+           <button onclick="askSuggested(this)">What do the scriptures teach about humility and selfless service?</button>
+         </div>
+       </div>
+     </div>
+
+     <!-- Input -->
+     <div class="input-area">
+       <div class="input-row">
+         <textarea
+           id="questionInput"
+           placeholder="Ask a question from the sacred texts…"
+           rows="1"
+           onkeydown="handleKey(event)"
+           oninput="autoResize(this)"
+         ></textarea>
+         <button class="send-btn" id="sendBtn" onclick="sendQuestion()" title="Ask (Enter)">
+           ✦
+         </button>
+       </div>
+       <p class="input-hint">Press Enter to ask · Shift+Enter for new line · Answers grounded strictly in the sacred texts</p>
+     </div>
+
+   </div>
+
+   <script>
+     const API_BASE = "http://localhost:8000";
+     let isLoading = false;
+
+     // ── Helpers ────────────────────────────────────────────────
+     function getSourceClass(book) {
+       const b = book.toLowerCase();
+       if (b.includes("gita")) return "source-gita";
+       if (b.includes("quran") || b.includes("koran")) return "source-quran";
+       if (b.includes("bible") || b.includes("testament")) return "source-bible";
+       if (b.includes("granth") || b.includes("guru")) return "source-granth";
+       return "source-other";
+     }
+
+     function hideWelcome() {
+       const w = document.getElementById("welcomePane");
+       if (w) w.remove();
+     }
+
+     function scrollToBottom() {
+       const w = document.getElementById("chatWindow");
+       w.scrollTop = w.scrollHeight;
+     }
+
+     function autoResize(el) {
+       el.style.height = "auto";
+       el.style.height = Math.min(el.scrollHeight, 140) + "px";
+     }
+
+     function formatAnswer(text) {
+       // Convert markdown-ish bold (**text**) to <strong>
+       text = text.replace(/\*\*(.*?)\*\*/g, "<strong>$1</strong>");
+       // Wrap paragraphs
+       return text.split(/\n\n+/).filter(p => p.trim()).map(p => `<p>${p.trim()}</p>`).join("");
+     }
+
+     // ── Append message to chat ─────────────────────────────────
+     function appendUserMessage(question) {
+       const w = document.getElementById("chatWindow");
+       const div = document.createElement("div");
+       div.className = "message message-user";
+       div.innerHTML = `
+         <span class="msg-label">You</span>
+         <div class="msg-bubble">${escapeHtml(question)}</div>
+       `;
+       w.appendChild(div);
+       scrollToBottom();
+     }
+
+     function appendLoading() {
+       const w = document.getElementById("chatWindow");
+       const div = document.createElement("div");
+       div.className = "message message-assistant";
+       div.id = "loadingMsg";
+       div.innerHTML = `
+         <span class="msg-label">Sacred Texts</span>
+         <div class="loading">
+           <div class="loading-dots"><span></span><span></span><span></span></div>
+           <span class="loading-text">Consulting the scriptures…</span>
+         </div>
+       `;
+       w.appendChild(div);
+       scrollToBottom();
+       return div;
+     }
+
+     function replaceLoadingWithAnswer(loadingEl, data) {
+       // Build source tags
+       const sourceTags = (data.sources || []).map(s => {
+         const cls = getSourceClass(s.book);
+         return `<span class="source-tag ${cls}" title="Page ${s.page}">📖 ${s.book}</span>`;
+       }).join("");
+
+       const sourcesHtml = sourceTags ? `
+         <div class="sources">
+           <div class="sources-label">References</div>
+           <div class="source-tags">${sourceTags}</div>
+         </div>
+       ` : "";
+
+       loadingEl.innerHTML = `
+         <span class="msg-label">Sacred Texts</span>
+         <div class="msg-bubble">${formatAnswer(data.answer)}</div>
+         ${sourcesHtml}
+       `;
+       scrollToBottom();
+     }
+
+     function replaceLoadingWithError(loadingEl, msg) {
+       loadingEl.innerHTML = `
+         <span class="msg-label">Error</span>
+         <div class="error-bubble">⚠️ ${escapeHtml(msg)}</div>
+       `;
+       scrollToBottom();
+     }
+
+     function escapeHtml(str) {
+       return str.replace(/&/g,"&amp;").replace(/</g,"&lt;").replace(/>/g,"&gt;");
+     }
+
+     // ── Send question ──────────────────────────────────────────
+     async function sendQuestion() {
+       if (isLoading) return;
+       const input = document.getElementById("questionInput");
+       const question = input.value.trim();
+       if (!question) return;
+
+       hideWelcome();
+       isLoading = true;
+       document.getElementById("sendBtn").disabled = true;
+       input.value = "";
+       input.style.height = "auto";
+
+       appendUserMessage(question);
+       const loadingEl = appendLoading();
+
+       try {
+         const res = await fetch(`${API_BASE}/ask`, {
+           method: "POST",
+           headers: { "Content-Type": "application/json" },
+           body: JSON.stringify({ question }),
+         });
+
+         if (!res.ok) {
+           const err = await res.json().catch(() => ({ detail: res.statusText }));
+           throw new Error(err.detail || "Server error");
+         }
+
+         const data = await res.json();
+         replaceLoadingWithAnswer(loadingEl, data);
+       } catch (err) {
+         let msg = err.message;
+         if (msg.includes("fetch") || msg.includes("NetworkError") || msg.includes("Failed")) {
+           msg = "Cannot reach the server. Make sure `python app.py` is running on localhost:8000.";
+         }
+         replaceLoadingWithError(loadingEl, msg);
+       } finally {
+         isLoading = false;
+         document.getElementById("sendBtn").disabled = false;
+         input.focus();
+       }
+     }
+
+     function askSuggested(btn) {
+       const input = document.getElementById("questionInput");
+       input.value = btn.textContent;
+       autoResize(input);
+       sendQuestion();
+     }
+
+     function handleKey(e) {
+       if (e.key === "Enter" && !e.shiftKey) {
+         e.preventDefault();
+         sendQuestion();
+       }
+     }
+   </script>
+ </body>
+ </html>
ingest.py ADDED
@@ -0,0 +1,179 @@
+ """
+ ingest.py — Step 1: Build the vector knowledge base from religious PDFs.
+
+ Run this ONCE before starting the app:
+     python ingest.py
+
+ It will:
+     1. Load all PDFs from the ./books/ directory
+     2. Split them into overlapping semantic chunks
+     3. Embed each chunk using NVIDIA's llama-nemotron embedding model
+     4. Persist everything into a local ChromaDB vector store
+ """
+
+ import os
+ import sys
+ from pathlib import Path
+ from dotenv import load_dotenv
+
+ from langchain_community.document_loaders import PyPDFLoader, PyMuPDFLoader
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
+ from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
+ from langchain_chroma import Chroma
+
+ load_dotenv()
+
+ # ─── Configuration ────────────────────────────────────────────────────────────
+
+ BOOKS_DIR = Path("./books")
+ CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_db")
+ COLLECTION_NAME = os.getenv("COLLECTION_NAME", "sacred_texts")
+ NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
+
+ # Mapping of filename keywords → friendly book name stored in metadata
+ BOOK_NAME_MAP = {
+     "gita": "Bhagavad Gita",
+     "bhagavad": "Bhagavad Gita",
+     "quran": "Quran",
+     "koran": "Quran",
+     "bible": "Bible",
+     "testament": "Bible",
+     "granth": "Guru Granth Sahib",
+     "guru": "Guru Granth Sahib",
+ }
+ # Chunk settings — tuned for religious texts (verses are short)
+ CHUNK_SIZE = 800       # characters per chunk
+ CHUNK_OVERLAP = 150    # overlap to preserve verse context across boundaries
+
+
+ # ─── Helpers ──────────────────────────────────────────────────────────────────
+
+ def detect_book_name(filename: str) -> str:
+     """Infer the book's display name from its filename."""
+     name_lower = filename.lower()
+     for keyword, book_name in BOOK_NAME_MAP.items():
+         if keyword in name_lower:
+             return book_name
+     # Fallback: use the filename stem, title-cased
+     return Path(filename).stem.replace("_", " ").title()
+
+
+ def load_pdf(pdf_path: Path) -> list:
+     """
+     Load a PDF using PyMuPDF (preferred), falling back to PyPDF on failure.
+     Returns a list of LangChain Document objects.
+     """
+     try:
+         loader = PyMuPDFLoader(str(pdf_path))
+         docs = loader.load()
+         print(f"  📖 Loaded with PyMuPDF: {pdf_path.name}")
+     except Exception:
+         # Constructing PyMuPDFLoader rarely raises — the load() call itself
+         # must be inside the try block for the fallback to ever trigger.
+         loader = PyPDFLoader(str(pdf_path))
+         docs = loader.load()
+         print(f"  📖 Loaded with PyPDF: {pdf_path.name}")
+     print(f"     → {len(docs)} pages loaded")
+     return docs
+
+
+ def tag_documents(docs: list, book_name: str, source_file: str) -> list:
+     """
+     Enrich each document's metadata with:
+       - book: display name (e.g. "Bhagavad Gita")
+       - source_file: original filename
+     """
+     for doc in docs:
+         doc.metadata["book"] = book_name
+         doc.metadata["source_file"] = source_file
+         # Keep the page number if already present from the loader
+         if "page" not in doc.metadata:
+             doc.metadata["page"] = 0
+     return docs
+
+
+ # ─── Main Ingestion ───────────────────────────────────────────────────────────
+
+ def ingest():
+     if not NVIDIA_API_KEY:
+         print("❌ NVIDIA_API_KEY not set. Add it to your .env file.")
+         sys.exit(1)
+
+     if not BOOKS_DIR.exists():
+         print(f"❌ Books directory not found: {BOOKS_DIR.resolve()}")
+         print("   Create a ./books/ folder and add your PDFs there.")
+         sys.exit(1)
+
+     pdf_files = list(BOOKS_DIR.glob("*.pdf"))
+     if not pdf_files:
+         print(f"❌ No PDF files found in {BOOKS_DIR.resolve()}")
+         sys.exit(1)
+
+     print(f"\n🕊️ Sacred Texts RAG — Ingestion Pipeline")
+     print(f"{'─' * 50}")
+     print(f"📂 Books directory : {BOOKS_DIR.resolve()}")
+     print(f"💾 ChromaDB path   : {Path(CHROMA_DB_PATH).resolve()}")
+     print(f"📚 PDFs found      : {len(pdf_files)}")
+     print(f"{'─' * 50}\n")
+
+     # ── Step 1: Load all PDFs ────────────────────────────────────────────────
+     all_docs = []
+     for pdf_path in pdf_files:
+         book_name = detect_book_name(pdf_path.name)
+         print(f"📕 {book_name}")
+         raw_docs = load_pdf(pdf_path)
+         tagged_docs = tag_documents(raw_docs, book_name, pdf_path.name)
+         all_docs.extend(tagged_docs)
+         print(f"  ✅ Tagged as '{book_name}'\n")
+
+     print(f"📄 Total pages loaded: {len(all_docs)}")
+
+     # ── Step 2: Split into chunks ────────────────────────────────────────────
+     print(f"\n✂️ Splitting into chunks (size={CHUNK_SIZE}, overlap={CHUNK_OVERLAP})...")
+     splitter = RecursiveCharacterTextSplitter(
+         chunk_size=CHUNK_SIZE,
+         chunk_overlap=CHUNK_OVERLAP,
+         separators=["\n\n", "\n", ". ", " ", ""],  # Respect paragraph/verse boundaries
+     )
+     chunks = splitter.split_documents(all_docs)
+     print(f"  → {len(chunks)} chunks created")
+
+     # ── Step 3: Embed & store ────────────────────────────────────────────────
+     print(f"\n🔢 Initialising NVIDIA embedding model (llama-nemotron-embed-vl-1b-v2)...")
+     embeddings = NVIDIAEmbeddings(
+         model="nvidia/llama-nemotron-embed-vl-1b-v2",
+         api_key=NVIDIA_API_KEY,
+         truncate="NONE",
+     )
+
+     print(f"💾 Building ChromaDB vector store — this may take a few minutes...")
+     print(f"   (Embedding {len(chunks)} chunks...)\n")
+
+     # Process in batches to avoid rate limits
+     BATCH_SIZE = 100
+     vector_store = None
+
+     for i in range(0, len(chunks), BATCH_SIZE):
+         batch = chunks[i : i + BATCH_SIZE]
+         batch_num = i // BATCH_SIZE + 1
+         total_batches = (len(chunks) + BATCH_SIZE - 1) // BATCH_SIZE
+         print(f"  Batch {batch_num}/{total_batches}: embedding {len(batch)} chunks...")
+
+         if vector_store is None:
+             vector_store = Chroma.from_documents(
+                 documents=batch,
+                 embedding=embeddings,
+                 persist_directory=CHROMA_DB_PATH,
+                 collection_name=COLLECTION_NAME,
+             )
+         else:
+             vector_store.add_documents(batch)
+
+     print(f"\n{'─' * 50}")
+     print(f"✅ Ingestion complete!")
+     print(f"  📦 {len(chunks)} chunks stored in ChromaDB")
+     print(f"  📂 Location: {Path(CHROMA_DB_PATH).resolve()}")
+     print(f"\n👉 Now run: python app.py")
+     print(f"{'─' * 50}\n")
+
+
+ if __name__ == "__main__":
+     ingest()
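A quick note on the filename → book-name mapping in `ingest.py`: files matching no keyword fall through to the title-cased filename stem, so new PDFs still get a readable metadata label without editing `BOOK_NAME_MAP`. A minimal standalone sketch (the map and function mirror the ones above; the filenames are invented for illustration):

```python
from pathlib import Path

# Keyword → display-name map, as used for metadata tagging in ingest.py
BOOK_NAME_MAP = {
    "gita": "Bhagavad Gita",
    "quran": "Quran",
    "bible": "Bible",
    "granth": "Guru Granth Sahib",
}

def detect_book_name(filename: str) -> str:
    """Return the mapped display name, or the title-cased stem as a fallback."""
    name_lower = filename.lower()
    for keyword, book_name in BOOK_NAME_MAP.items():
        if keyword in name_lower:
            return book_name
    return Path(filename).stem.replace("_", " ").title()

print(detect_book_name("The-Quran-Translation.pdf"))  # → Quran
print(detect_book_name("tao_te_ching.pdf"))           # → Tao Te Ching
```

The fallback matters because every chunk's `book` metadata later drives the per-book retrieval filter in `rag_chain.py`.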
rag_chain.py ADDED
@@ -0,0 +1,240 @@
+ """
+ rag_chain.py — Core RAG chain using LangChain + Gemini.
+
+ KEY FIX: Uses per-book retrieval (guaranteed slots per scripture) instead of
+ a single similarity search — so no book gets starved from the context window
+ when the query is semantically closer to another book's language.
+
+ This module exposes a single function:
+     answer = query_sacred_texts(user_question)
+
+ Returns a dict with:
+     {
+         "answer": "...",
+         "sources": [
+             {"book": "Bhagavad Gita", "page": 42, "snippet": "..."},
+             ...
+         ]
+     }
+ """
+
+ import os
+ from dotenv import load_dotenv
+ from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_chroma import Chroma
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.output_parsers import StrOutputParser
+
+ load_dotenv()
+
+ GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
+ NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
+ CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_db")
+ COLLECTION_NAME = os.getenv("COLLECTION_NAME", "sacred_texts")
+
+ # Chunks retrieved PER BOOK — guarantees every scripture contributes to the answer
+ CHUNKS_PER_BOOK = int(os.getenv("CHUNKS_PER_BOOK", "3"))
+
+ # All books currently in the knowledge base — add new books here as you ingest them
+ KNOWN_BOOKS = [
+     "Bhagavad Gita",
+     "Quran",
+     "Bible",
+     "Guru Granth Sahib",
+ ]
+
+
+ # ─── System Prompt ────────────────────────────────────────────────────────────
+
+ SYSTEM_PROMPT = """You are a scholarly and compassionate guide to sacred scriptures.
+ You have deep knowledge of the Bhagavad Gita, the Quran, the Bible, and the Guru Granth Sahib.
+
+ STRICT RULES you must ALWAYS follow:
+ 1. Answer ONLY using the provided context passages. Do NOT use any external knowledge.
+ 2. If a specific book's passages are provided but not relevant to the question, skip that book.
+ 3. If NONE of the context is relevant, say: "The provided texts do not directly address this question."
+ 4. Always cite which book(s) your answer draws from.
+ 5. When the question asks to COMPARE books (e.g. "what do Quran and Gita say"), you MUST
+    address EACH of those books separately, then synthesise the common thread.
+ 6. Be respectful and neutral toward all faiths — treat each text with equal reverence.
+ 7. Do NOT speculate, invent verses, or add information beyond the context.
+
+ FORMAT your response as:
+ - A clear, thoughtful answer (2–4 paragraphs)
+ - A "📚 Sources" section listing each book referenced with the key insight drawn from it
+
+ Context passages from the sacred texts (guaranteed passages from each book):
+ ────────────────────────────────────────
+ {context}
+ ────────────────────────────────────────
+ """
+
+ HUMAN_PROMPT = "Question: {question}"
+
+
+ # ─── Embeddings & Vector Store ────────────────────────────────────────────────
+
+ def get_embeddings():
+     return NVIDIAEmbeddings(
+         model="nvidia/llama-nemotron-embed-vl-1b-v2",
+         api_key=NVIDIA_API_KEY,
+         truncate="NONE",
+     )
+
+
+ def get_vector_store(embeddings):
+     return Chroma(
+         persist_directory=CHROMA_DB_PATH,
+         embedding_function=embeddings,
+         collection_name=COLLECTION_NAME,
+     )
+
+
+ # ─── Per-Book Retrieval ───────────────────────────────────────────────────────
94
+
95
+ def retrieve_per_book(question: str, vector_store: Chroma) -> list:
96
+ """
97
+ Retrieve CHUNKS_PER_BOOK chunks from EACH known book independently,
98
+ using a metadata filter. This guarantees every scripture is represented
99
+ in the context β€” no book can be crowded out by higher-scoring chunks
100
+ from another book.
101
+ """
102
+ all_docs = []
103
+ for book in KNOWN_BOOKS:
104
+ try:
105
+ results = vector_store.similarity_search(
106
+ query=question,
107
+ k=CHUNKS_PER_BOOK,
108
+ filter={"book": book}, # ← metadata filter: only this book
109
+ )
110
+ if results:
111
+ print(f" πŸ“– {book}: {len(results)} chunk(s) retrieved")
112
+ else:
113
+ print(f" ⚠️ {book}: 0 chunks found (not ingested?)")
114
+ all_docs.extend(results)
115
+ except Exception as e:
116
+ print(f" ❌ {book}: retrieval error β€” {e}")
117
+
118
+ return all_docs
119
+
120
+
+ # ─── Format Retrieved Docs ────────────────────────────────────────────────────
+
+ def format_docs(docs: list) -> str:
+     """
+     Format retrieved documents grouped by book for clarity.
+     Each chunk is labelled with book and page number.
+     """
+     # Group by book to keep context readable
+     by_book: dict[str, list] = {}
+     for doc in docs:
+         book = doc.metadata.get("book", "Unknown")
+         by_book.setdefault(book, []).append(doc)
+
+     sections = []
+     for book, book_docs in by_book.items():
+         header = f"═══ {book} ═══"
+         chunks = []
+         for i, doc in enumerate(book_docs, 1):
+             page = doc.metadata.get("page", "?")
+             chunks.append(f"  [{i}] (Page {page}): {doc.page_content.strip()}")
+         sections.append(header + "\n" + "\n\n".join(chunks))
+
+     return "\n\n".join(sections)
+
+
+ # ─── Build the RAG Chain ──────────────────────────────────────────────────────
+
+ def build_chain():
+     """Build and return the LLM chain and vector store."""
+     embeddings = get_embeddings()
+     vector_store = get_vector_store(embeddings)
+
+     llm = ChatGoogleGenerativeAI(
+         model="gemini-2.5-flash-lite",
+         google_api_key=GEMINI_API_KEY,
+         temperature=0.2,
+         max_output_tokens=1500,
+     )
+
+     prompt = ChatPromptTemplate.from_messages([
+         ("system", SYSTEM_PROMPT),
+         ("human", HUMAN_PROMPT),
+     ])
+
+     # Chain: prompt → LLM → string output
+     # (retrieval is handled manually in query_sacred_texts for per-book control)
+     llm_chain = prompt | llm | StrOutputParser()
+
+     return llm_chain, vector_store
+
+
+ # ─── Public API ───────────────────────────────────────────────────────────────
+
+ _llm_chain = None
+ _vector_store = None
+
+
+ def query_sacred_texts(question: str) -> dict:
+     """
+     Query the sacred texts knowledge base with guaranteed per-book retrieval.
+
+     Args:
+         question: The user's spiritual/philosophical question.
+
+     Returns:
+         {
+             "answer": str,
+             "sources": list[dict]  # [{book, page, snippet}, ...]
+         }
+     """
+     global _llm_chain, _vector_store
+
+     if _llm_chain is None:
+         print("🔧 Initialising RAG chain (first call)...")
+         _llm_chain, _vector_store = build_chain()
+
+     # Step 1: Retrieve per-book (guaranteed slots for every scripture)
+     print(f"\n🔍 Retrieving {CHUNKS_PER_BOOK} chunks per book for: '{question}'")
+     source_docs = retrieve_per_book(question, _vector_store)
+
+     if not source_docs:
+         return {
+             "answer": "No content found in the knowledge base. Please run ingest.py first.",
+             "sources": [],
+         }
+
+     # Step 2: Format context grouped by book
+     context = format_docs(source_docs)
+
+     # Step 3: Generate answer
+     answer = _llm_chain.invoke({"context": context, "question": question})
+
+     # Step 4: Build deduplicated source list for the UI
+     seen_books = set()
+     sources = []
+     for doc in source_docs:
+         book = doc.metadata.get("book", "Unknown")
+         page = doc.metadata.get("page", "?")
+         snippet = doc.page_content[:200].strip() + "..."
+         if book not in seen_books:
+             seen_books.add(book)
+             sources.append({"book": book, "page": page, "snippet": snippet})
+
+     return {
+         "answer": answer,
+         "sources": sources,
+     }
+
+
+ # ─── Quick CLI Test ───────────────────────────────────────────────────────────
+
+ if __name__ == "__main__":
+     test_q = "In what aspects do the Quran and Gita teach the same thing?"
+     print(f"\n🔍 Test query: {test_q}\n")
+     result = query_sacred_texts(test_q)
+     print("📝 Answer:\n")
+     print(result["answer"])
+     print("\n📚 Sources retrieved:")
+     for s in result["sources"]:
+         print(f"  - {s['book']} (page {s['page']})")
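To make the post-retrieval bookkeeping in `rag_chain.py` concrete, here is a minimal, dependency-free sketch of the grouping done in `format_docs` and the per-book deduplication from step 4 of `query_sacred_texts`. Plain dicts stand in for LangChain `Document` objects, and the sample passages and page numbers are invented:

```python
# Stand-ins for retrieved Documents: book/page metadata plus page content
docs = [
    {"book": "Quran",         "page": 3,  "text": "Verily, with hardship comes ease."},
    {"book": "Bhagavad Gita", "page": 12, "text": "You have a right to action alone."},
    {"book": "Quran",         "page": 7,  "text": "And He found you lost and guided you."},
]

# Group chunks by book, preserving retrieval order (as format_docs does)
by_book: dict[str, list[dict]] = {}
for d in docs:
    by_book.setdefault(d["book"], []).append(d)

for book, group in by_book.items():
    print(f"═══ {book} ═══ ({len(group)} chunk(s))")

# Deduplicate to one UI source entry per book (as in query_sacred_texts, step 4)
seen, sources = set(), []
for d in docs:
    if d["book"] not in seen:
        seen.add(d["book"])
        sources.append({"book": d["book"], "page": d["page"], "snippet": d["text"][:200]})

print([s["book"] for s in sources])  # → ['Quran', 'Bhagavad Gita']
```

The dedup keeps only the first (highest-ranked) chunk per book, which is why the UI shows one snippet per scripture even though `CHUNKS_PER_BOOK` chunks feed the prompt.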
requirements.txt ADDED
@@ -0,0 +1,26 @@
+ # Core LangChain
+ langchain
+ langchain-google-genai
+ langchain-community
+ langchain-chroma
+ langchain-nvidia-ai-endpoints
+ langchain-text-splitters
+
+ # Vector Store
+ chromadb
+
+ # PDF Loading
+ pypdf
+ pymupdf  # Better PDF parsing (optional but recommended)
+
+ # Google Gemini
+ google-generativeai
+
+ # API Server
+ fastapi
+ uvicorn[standard]
+ python-multipart
+
+ # Utilities
+ python-dotenv
+ pydantic