Qar-Raz committed on
Commit d73bcb6 · verified · 1 Parent(s): c64aaec

Sync backend Docker context from GitHub main

Files changed (2):
  1. README.md +136 -1
  2. requirements.txt +1 -2
README.md CHANGED
@@ -9,4 +9,139 @@ license: mit
  short_description: NLP Spring 2026 Project 1
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # RAG-based Question-Answering System for Cognitive Behavior Therapy (CBT)
+
+ ## Overview
+
+ This project is a Retrieval-Augmented Generation (RAG) system built to answer CBT-related questions using grounded evidence from source manuals rather than generic model knowledge. It combines hybrid retrieval, re-ranking, and strict response constraints so the assistant stays accurate, clinically focused, and less prone to hallucination.
+
+ ## Index
+
+ - [Overview](#overview)
+ - [Live Demo and Repository](#live-demo-and-repository)
+ - [Live Web Interface](#live-web-interface)
+ - [Tech Stack](#tech-stack)
+ - [System Architecture](#system-architecture)
+ - [Key Features](#key-features)
+ - [Installation and Setup](#installation-and-setup)
+ - [Configuration](#configuration)
+ - [Testing](#testing)
+ - [Contributors](#contributors)
+
+ ## Live Demo and Repository
+
+ - Live Demo: https://rag-as-3-nlp.vercel.app/
+ - Code Repository: https://github.com/ramailkk/RAG-AS3-NLP
+
+ ## Live Web Interface
+
+ Add the frontend screenshots here.
+
+ ### Frontend Image 1
+
+ <!-- Add image here -->
+
+ ### Frontend Image 2
+
+ <!-- Add image here -->
+
+ ## Tech Stack
+
+ - Frontend: Vercel (Node.js/React)
+ - Backend: Hugging Face Spaces (FastAPI)
+ - Vector Database: Pinecone
+ - Embeddings: jinaai/jina-embeddings-v2-small-en
+ - LLMs: Llama-3-8B (primary), TinyAya, Mistral-7B, Qwen-2.5
+ - Re-ranking: Voyage AI (rerank-2.5) and Cross-Encoder (ms-marco-MiniLM-L-6-v2)
+ - Retrieval: Hybrid Search (Dense + BM25 Sparse)
+
+ ## System Architecture
+
+ The system operates through a high-precision, multi-stage pipeline to ensure clinical safety and data grounding:
+
+ - Hybrid Retrieval: Simultaneously queries a dense vector index for semantic intent and a sparse BM25 index for specific clinical terminology such as Socratic Questioning or Cognitive Distortions.
+ - Fusion & Re-ranking: Uses Reciprocal Rank Fusion (RRF) to merge the two result lists, followed by a Cross-Encoder stage that re-scores each chunk's relevance to the user query.
+ - Diversity Filtering (MMR): Applies Maximal Marginal Relevance so the context passed to the LLM is not redundant.
+ - Prompt Engineering: Employs a specialized persona, an empathetic CBT therapist, with strict grounding constraints that prevent the use of outside knowledge.
+ - Automated Evaluation: An LLM-as-a-Judge framework scores:
+   - Faithfulness: Verifying claims against the source document.
+   - Relevancy: Ensuring the answer directly addresses the user's query.
+
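As a concrete illustration of the fusion step, Reciprocal Rank Fusion can be sketched in a few lines. This is a minimal, editorial sketch, not the project's actual implementation; the function name and the conventional smoothing constant k=60 are assumptions.

```python
# Minimal sketch of Reciprocal Rank Fusion (RRF): each document earns
# 1 / (k + rank) from every ranking it appears in, and the merged list
# is sorted by the summed score. Names and k=60 are illustrative only.

def rrf_merge(rankings, k=60):
    """Merge ranked lists of doc IDs into one RRF-scored ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["chunk_a", "chunk_b", "chunk_c"]   # ranked by embedding similarity
sparse = ["chunk_b", "chunk_d", "chunk_a"]  # ranked by BM25
merged = rrf_merge([dense, sparse])
```

A chunk that ranks moderately well in both lists (like `chunk_b` here) tends to beat one that ranks first in only a single list, which is why RRF is a popular parameter-light fusion choice.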
+ ## Key Features
+
+ - Clinical Domain Focus: Optimized for the high-density information found in mental health manuals.
+ - Zero Tolerance for Hallucinations: A fallback protocol states when information is missing rather than inventing therapeutic advice.
+ - Advanced Chunking: Sentence-level and recursive character splitting preserve the logical flow of therapeutic guidelines and patient transcripts.
+ - Multi-Model Support: Tested across multiple LLMs to find the best balance between latency and grounding.
+
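The recursive character splitting idea mentioned above can be sketched as follows: try coarse separators first (paragraphs), then fall back to finer ones until every chunk fits. This is an illustrative sketch only; the project's actual chunker, separators, and size limits may differ.

```python
# Minimal sketch of recursive character splitting. Separators are tried
# from coarsest to finest; oversized pieces are split again with the
# remaining separators. Parameters here are assumptions, not project code.

def recursive_split(text, max_len=200, seps=("\n\n", "\n", ". ", " ")):
    if len(text) <= max_len or not seps:
        return [text]
    head, *rest = seps
    parts = text.split(head)
    if len(parts) == 1:  # separator absent in this text; try a finer one
        return recursive_split(text, max_len, tuple(rest))
    chunks = []
    for part in parts:
        if len(part) <= max_len:
            chunks.append(part)
        else:
            chunks.extend(recursive_split(part, max_len, tuple(rest)))
    return [c for c in chunks if c.strip()]
```

Splitting on paragraph and sentence boundaries before raw characters is what keeps a therapeutic guideline or transcript turn intact inside a single chunk whenever possible.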
+ ## Installation and Setup
+
+ ### Backend Setup
+
+ The backend handles document processing, Pinecone vector operations, and the hybrid retrieval logic.
+
+ 1. Initialize a virtual environment:
+
+ ```bash
+ python -m venv .venv
+ # Windows (Git Bash)
+ source .venv/Scripts/activate
+ # Linux/macOS
+ source .venv/bin/activate
+ ```
+
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Launch the API server:
+
+ ```bash
+ uvicorn backend.api:app --reload --host 0.0.0.0 --port 8000
+ ```
+
+ ### Frontend Setup
+
+ The frontend provides the interactive chat interface and real-time evaluation scores.
+
+ 1. Navigate to the frontend directory and install dependencies:
+
+ ```bash
+ cd frontend
+ npm install
+ ```
+
+ 2. Start the development server:
+
+ ```bash
+ npm run dev
+ ```
+
+ ## Configuration
+
+ To replicate the system, ensure your environment variables contain valid API keys for:
+
+ - Pinecone for vector storage
+ - OpenRouter or the Hugging Face Inference API for LLM access
+ - Voyage AI for re-ranking
+
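A minimal startup check for these variables might look like the sketch below. The variable names are assumptions for illustration; match them to the names your deployment actually uses.

```python
import os

# Illustrative fail-fast check for required API keys. The key names are
# assumptions, not the project's actual configuration.
REQUIRED_KEYS = ["PINECONE_API_KEY", "OPENROUTER_API_KEY", "VOYAGE_API_KEY"]

def missing_keys(env=None):
    """Return the required variables that are absent or empty."""
    env = os.environ if env is None else env
    return [k for k in REQUIRED_KEYS if not env.get(k)]
```

Calling `missing_keys()` at application startup and aborting when it returns a non-empty list surfaces configuration problems before the first request instead of as mid-request API errors.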
+ ## Testing
+
+ Run `test.py` to execute the retrieval test suite and generate a complete Markdown report of the results:
+
+ ```bash
+ python test.py
+ ```
+
+ This script evaluates multiple test queries across the configured chunking techniques and retrieval strategies, then writes the full output to `retrieval_report.md`.
+
+ ## Contributors
+
+ - Ramail Khan ([ramailkk](https://github.com/ramailkk))
+ - Qamar Raza ([Qar-Raz](https://github.com/Qar-Raz))
+ - Muddasir Javed ([bsparx](https://github.com/bsparx))
requirements.txt CHANGED
@@ -96,5 +96,4 @@ groq==1.1.2
  jiter==0.13.0
  openai==2.30.0
  pinecone-text>=0.11.0
- voyageai==0.3.7
-
+ voyageai==0.3.7