metadata

title: MedChatBot
emoji: 💊
colorFrom: blue
colorTo: green
sdk: docker
pinned: false

MedChatBot

A medical chatbot application that uses a RAG (Retrieval-Augmented Generation) architecture to answer medical questions from medical literature. The system uses Google Gemini 2.5 Pro as the language model and a Pinecone vector database for document retrieval.
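To make the retrieve-then-generate flow concrete, here is a minimal, self-contained sketch. It is not the project's code: the bag-of-words `embed` function is a toy stand-in for the sentence-transformers model, the in-memory list stands in for Pinecone, and the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. In the real application the resulting prompt would be sent to Gemini.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine retrieved context and the question into one prompt string."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "Aspirin is used to reduce fever and pain.",
    "Asthma is a chronic lung disease.",
    "Bronchitis inflames the bronchial tubes.",
]
top = retrieve("what is asthma", chunks, k=1)
prompt = build_prompt("what is asthma", top)
```

The point of the design is that the language model only sees the few chunks that rank highest for the question, rather than the whole document collection.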

Technology Stack

Backend: Flask
Language Model: Google Gemini 2.5 Pro
Vector Database: Pinecone
Embeddings: HuggingFace sentence-transformers (all-MiniLM-L6-v2)
Document Processing: LangChain, PyPDF
Frontend: HTML/CSS/JavaScript

Installation & Setup

Step 1: Clone the Repository

git clone https://github.com/TMTien31/MedChatBot.git
cd MedChatBot

Step 2: Create Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate

# On macOS/Linux:
source venv/bin/activate

Step 3: Install Dependencies

pip install -r requirements.txt

Step 4: Get API Keys

Google Gemini API Key:

Go to Google AI Studio
Create a new API key
Copy the generated key

Pinecone API Key:

Sign up at Pinecone
Go to your dashboard
Copy your API key from the "API Keys" section

Step 5: Create Environment File

Create a .env file in the project root directory:

# Create .env file
touch .env  # On macOS/Linux
# or create manually on Windows

Add your API keys to the .env file:

PINECONE_API_KEY=your_pinecone_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
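A quick way to confirm that both keys are visible to Python is sketched below. It assumes the variables have already been loaded into the process environment (how the app loads the .env file is not shown here), and the helper name `missing_keys` is hypothetical.

```python
import os

# The two variable names used in the .env file above.
REQUIRED_KEYS = ("PINECONE_API_KEY", "GEMINI_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```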

Step 6: Prepare Medical Documents

Place your PDF medical documents in the Data/ folder
The project includes "Gale Encyclopedia of Medicine Vol. 1 (A-B).pdf" by default
You can add more medical PDFs to expand the knowledge base
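Before indexing, it can help to check which PDFs are actually in the folder. The sketch below is a standalone sanity check, not part of the project; the helper name `list_pdfs` is hypothetical, and it assumes the indexing script looks only at files ending in .pdf directly inside Data/.

```python
from pathlib import Path


def list_pdfs(data_dir="Data"):
    """Return the sorted names of PDF files directly inside data_dir."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    for name in list_pdfs():
        print(name)
```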

Step 7: Create Vector Index (Run Once)

Important: Run this step once during initial setup, and again whenever you add new documents to the Data/ folder.

python store_index.py

This script will:

Read all PDF files from the Data/ directory
Split text into 500-character chunks with 20-character overlap
Generate embeddings using sentence-transformers
Create and populate a Pinecone index named "medchatbot"
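The chunking step can be illustrated with a simplified sketch. This is plain fixed-width slicing, not the LangChain splitter the project uses (which may prefer to break on separators such as paragraphs), and the function name `split_text` is hypothetical. It shows what a 500-character chunk size with a 20-character overlap means: each chunk starts 480 characters after the previous one, so consecutive chunks share 20 characters.

```python
def split_text(text, chunk_size=500, overlap=20):
    """Split text into fixed-size chunks where neighbours share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        # Step forward, leaving `overlap` characters shared with the next chunk.
        start += chunk_size - overlap
    return chunks


sample = "".join(str(i % 10) for i in range(1000))
pieces = split_text(sample)
```

The overlap keeps a sentence that straddles a chunk boundary at least partly present in both chunks, which helps retrieval when the relevant text falls near a boundary.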

Note: This process may take several minutes depending on the size of your documents.

Running the Application

Start the Flask Server

python app.py

Access the Application

Open your web browser
Navigate to: http://localhost:8080 (http://0.0.0.0:8080 also works on some systems, but typically not on Windows)
You should see the medical chatbot interface