title: Chat PDF
emoji: π
colorFrom: purple
colorTo: green
sdk: streamlit
sdk_version: 1.35.0
app_file: app.py
pinned: false
Chat PDF
This is a RAG (Retrieval-Augmented Generation) project that lets users upload PDF files, extract their text, and ask questions about the content. The application uses the Gemini model from Google Generative AI to generate responses to user questions. Try the live demo here: https://salahin-chat-pdf.hf.space/
Features
- Upload multiple PDF files
- Extract text from PDF files
- Split text into chunks for efficient processing
- Create a vector store from text chunks using FAISS
- Use a conversational chain to generate responses to user questions
- Display conversation history
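The retrieval step behind the vector store feature can be illustrated with a minimal sketch. It is a brute-force stand-in for a FAISS index, not the app's actual code: the function names `cosine_similarity` and `top_k_chunks` are hypothetical, the vectors are toy values rather than real embeddings, and the choice of cosine similarity is an assumption (the README does not state which metric the index uses).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query vector.

    Hypothetical helper: a vector store such as FAISS does this lookup
    far more efficiently; this version simply scores every chunk.
    """
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


# Toy data standing in for embedded PDF chunks (assumed values).
chunks = ["chunk about invoices", "chunk about weather", "chunk about billing"]
chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best = top_k_chunks([1.0, 0.1], chunk_vecs, chunks, k=2)
```

In a RAG setup, the retrieved chunks would then be passed to the model as context alongside the user's question.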
How to Use
- Upload one or more PDF files using the file uploader in the sidebar.
- Click the "Submit" button to process the PDF files.
- Ask a question about the content of the PDF files using the chat input box.
- The application will generate a response to your question and display it in the chat window.
Technical Details
The application uses the following libraries:
- streamlit for the web interface
- PyPDF2 for extracting text from PDF files
- Langchain for creating a conversational chain
- Google Generative AI for generating responses to user questions
- FAISS for creating a vector store from text chunks
The application uses a recursive character text splitter to split text into chunks of 10,000 characters with an overlap of 1,000 characters. The conversational chain uses a prompt template to generate responses to user questions.
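To make the chunk size and overlap concrete, here is a simplified sketch of overlapping chunking. It is not the recursive character text splitter itself, which also tries to break on separators such as paragraphs and sentences; this version cuts at fixed character offsets only. The function name `split_with_overlap` is hypothetical.

```python
from __future__ import annotations


def split_with_overlap(text: str, chunk_size: int = 10_000, overlap: int = 1_000) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # each new chunk starts this far after the previous one
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


# Example: a 25,000-character document (synthetic text, for illustration only).
sample = "".join(str(i % 7) for i in range(25_000))
pieces = split_with_overlap(sample)
```

With the defaults above, each chunk begins 9,000 characters after the previous one, so the last 1,000 characters of a chunk reappear at the start of the next. The overlap reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.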
Local Run
To run the application locally, follow these steps:
- Clone the repository using git clone https://github.com/MahirSalahin/Chat-PDF.git
- Install the required libraries using pip install -r requirements.txt
- Set your GOOGLE_API_KEY in the .env file.
- Run the application using streamlit run app.py
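Before starting the app, it can help to confirm that the .env file actually defines the key. The sketch below is a standard-library check, assuming the file uses plain KEY=value lines; the function name `parse_env` and the placeholder value are hypothetical, and the app itself may load the file differently.

```python
def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder content; a real .env would hold your own key.
example_env = '# Google Generative AI credentials\nGOOGLE_API_KEY="your-api-key-here"\n'
settings = parse_env(example_env)
has_key = bool(settings.get("GOOGLE_API_KEY"))
```

To check a real file, read it with `open(".env").read()` and pass the contents to `parse_env`.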
License
This project is licensed under the MIT License.