Qar-Raz committed on
Commit d73bcb6 · verified · 1 Parent(s): c64aaec

Sync backend Docker context from GitHub main

Files changed (2):
  1. README.md +136 -1
  2. requirements.txt +1 -2
README.md CHANGED
@@ -9,4 +9,139 @@ license: mit
  short_description: NLP Spring 2026 Project 1
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # RAG-based Question-Answering System for Cognitive Behavior Therapy (CBT)
+
+ ## Overview
+
+ This project is a Retrieval-Augmented Generation (RAG) system built to answer CBT-related questions using grounded evidence from source manuals rather than generic model knowledge. It combines hybrid retrieval, re-ranking, and strict response constraints so the assistant stays accurate, clinically focused, and less prone to hallucination.
+
+ ## Index
+
+ - [Overview](#overview)
+ - [Live Demo and Repository](#live-demo-and-repository)
+ - [Live Web Interface](#live-web-interface)
+ - [Tech Stack](#tech-stack)
+ - [System Architecture](#system-architecture)
+ - [Key Features](#key-features)
+ - [Installation and Setup](#installation-and-setup)
+ - [Configuration](#configuration)
+ - [Testing](#testing)
+ - [Contributors](#contributors)
+
+ ## Live Demo and Repository
+
+ - Live Demo: https://rag-as-3-nlp.vercel.app/
+ - Code Repository: https://github.com/ramailkk/RAG-AS3-NLP
+
+ ## Live Web Interface
+
+ Add the frontend screenshots here.
+
+ ### Frontend Image 1
+
+ <!-- Add image here -->
+
+ ### Frontend Image 2
+
+ <!-- Add image here -->
+
+ ## Tech Stack
+
+ - Frontend: Vercel (Node.js/React)
+ - Backend: Hugging Face Spaces (FastAPI)
+ - Vector Database: Pinecone
+ - Embeddings: jinaai/jina-embeddings-v2-small-en
+ - LLMs: Llama-3-8B (primary), TinyAya, Mistral-7B, Qwen-2.5
+ - Re-ranking: Voyage AI (rerank-2.5) and Cross-Encoder (ms-marco-MiniLM-L-6-v2)
+ - Retrieval: Hybrid Search (Dense + BM25 Sparse)
+
+ ## System Architecture
+
+ The system operates through a high-precision, multi-stage pipeline to ensure clinical safety and data grounding:
+
+ - Hybrid Retrieval: Simultaneously queries a dense vector index for semantic intent and a sparse BM25 index for specific clinical terminology such as Socratic Questioning or Cognitive Distortions.
+ - Fusion & Re-ranking: Uses Reciprocal Rank Fusion (RRF) to merge the two result lists, followed by a Cross-Encoder stage that re-scores each chunk's relevance to the user query.
+ - Diversity Filtering (MMR): Applies Maximal Marginal Relevance so the context passed to the LLM is not redundant.
+ - Prompt Engineering: Employs a specialized persona, an empathetic CBT therapist, with strict grounding constraints that prevent the use of outside knowledge.
+ - Automated Evaluation: An LLM-as-a-Judge framework scores:
+   - Faithfulness: Verifying claims against the source document.
+   - Relevancy: Ensuring the answer directly addresses the user's query.
+
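As a concrete illustration of the fusion step, Reciprocal Rank Fusion can be sketched in a few lines. This is a minimal, editorial sketch, not the project's actual implementation; the function name and the conventional smoothing constant k=60 are assumptions.

```python
# Minimal sketch of Reciprocal Rank Fusion (RRF): each document earns
# 1 / (k + rank) from every ranking it appears in, and the merged list
# is sorted by the summed score. Names and k=60 are illustrative only.

def rrf_merge(rankings, k=60):
    """Merge ranked lists of doc IDs into one RRF-scored ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["chunk_a", "chunk_b", "chunk_c"]   # ranked by embedding similarity
sparse = ["chunk_b", "chunk_d", "chunk_a"]  # ranked by BM25
merged = rrf_merge([dense, sparse])
```

A chunk that ranks moderately well in both lists (like `chunk_b` here) tends to beat one that ranks first in only a single list, which is why RRF is a popular parameter-light fusion choice.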
+ ## Key Features
+
+ - Clinical Domain Focus: Optimized for the high-density information found in mental health manuals.
+ - Zero Tolerance for Hallucinations: A fallback protocol states when information is missing rather than inventing therapeutic advice.
+ - Advanced Chunking: Sentence-level and recursive character splitting preserve the logical flow of therapeutic guidelines and patient transcripts.
+ - Multi-Model Support: Tested across multiple LLMs to find the best balance between latency and grounding.
+
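The recursive character splitting idea mentioned above can be sketched as follows: try coarse separators first (paragraphs), then fall back to finer ones until every chunk fits. This is an illustrative sketch only; the project's actual chunker, separators, and size limits may differ.

```python
# Minimal sketch of recursive character splitting. Separators are tried
# from coarsest to finest; oversized pieces are split again with the
# remaining separators. Parameters here are assumptions, not project code.

def recursive_split(text, max_len=200, seps=("\n\n", "\n", ". ", " ")):
    if len(text) <= max_len or not seps:
        return [text]
    head, *rest = seps
    parts = text.split(head)
    if len(parts) == 1:  # separator absent in this text; try a finer one
        return recursive_split(text, max_len, tuple(rest))
    chunks = []
    for part in parts:
        if len(part) <= max_len:
            chunks.append(part)
        else:
            chunks.extend(recursive_split(part, max_len, tuple(rest)))
    return [c for c in chunks if c.strip()]
```

Splitting on paragraph and sentence boundaries before raw characters is what keeps a therapeutic guideline or transcript turn intact inside a single chunk whenever possible.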
+ ## Installation and Setup
+
+ ### Backend Setup
+
+ The backend handles document processing, Pinecone vector operations, and the hybrid retrieval logic.
+
+ 1. Initialize a virtual environment:
+
+ ```bash
+ python -m venv .venv
+ # Windows (Git Bash)
+ source .venv/Scripts/activate
+ # Linux/macOS
+ source .venv/bin/activate
+ ```
+
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Launch the API server:
+
+ ```bash
+ uvicorn backend.api:app --reload --host 0.0.0.0 --port 8000
+ ```
+
+ ### Frontend Setup
+
+ The frontend provides the interactive chat interface and real-time evaluation scores.
+
+ 1. Navigate to the frontend directory and install dependencies:
+
+ ```bash
+ cd frontend
+ npm install
+ ```
+
+ 2. Start the development server:
+
+ ```bash
+ npm run dev
+ ```
+
+ ## Configuration
+
+ To replicate the system, ensure your environment variables contain valid API keys for:
+
+ - Pinecone for vector storage
+ - OpenRouter or the Hugging Face Inference API for LLM access
+ - Voyage AI for re-ranking
+
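A minimal startup check for these variables might look like the sketch below. The variable names are assumptions for illustration; match them to the names your deployment actually uses.

```python
import os

# Illustrative fail-fast check for required API keys. The key names are
# assumptions, not the project's actual configuration.
REQUIRED_KEYS = ["PINECONE_API_KEY", "OPENROUTER_API_KEY", "VOYAGE_API_KEY"]

def missing_keys(env=None):
    """Return the required variables that are absent or empty."""
    env = os.environ if env is None else env
    return [k for k in REQUIRED_KEYS if not env.get(k)]
```

Calling `missing_keys()` at application startup and aborting when it returns a non-empty list surfaces configuration problems before the first request instead of as mid-request API errors.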
+ ## Testing
+
+ Run `test.py` to execute the retrieval test suite and generate a complete Markdown report of the results:
+
+ ```bash
+ python test.py
+ ```
+
+ This script evaluates multiple test queries across the configured chunking techniques and retrieval strategies, then writes the full output to `retrieval_report.md`.
+
+ ## Contributors
+
+ - Ramail Khan ([ramailkk](https://github.com/ramailkk))
+ - Qamar Raza ([Qar-Raz](https://github.com/Qar-Raz))
+ - Muddasir Javed ([bsparx](https://github.com/bsparx))
requirements.txt CHANGED
@@ -96,5 +96,4 @@ groq==1.1.2
  jiter==0.13.0
  openai==2.30.0
  pinecone-text>=0.11.0
- voyageai==0.3.7
-
+ voyageai==0.3.7