Commit 900b00d by RAHMAN00700 · unverified · 1 parent: e65e1ad

Update README.md

Files changed (1): README.md (+122 -1)
@@ -8,4 +8,125 @@ sdk_version: 1.40.0
8
  app_file: app.py
9
  pinned: false
10
  ---
11
- # chat-with-multiple-doc
11
+
12
+ # Multi-Document Retrieval with Watsonx
13
+
14
+ **A Streamlit-powered app for querying multiple document types using Watsonx and LangChain.**
15
+
16
+ This project lets users upload files in various formats (PDF, DOCX, CSV, JSON, YAML, HTML, and others) and ask questions about them. Answers come from retrieval-augmented generation (RAG): relevant passages are retrieved from the uploaded documents and passed to a Watsonx LLM through LangChain.
17
+
18
+ ---
19
+
20
+ ## Features
21
+
22
+ - **File Support**: Supports multiple file formats such as PDFs, Word documents, PowerPoint presentations, CSV, JSON, YAML, HTML, and plain text.
23
+ - **Watsonx LLM Integration**: Uses IBM Watsonx LLMs to answer queries and generate responses.
24
+ - **Embeddings**: Uses `HuggingFace` embeddings for document indexing.
25
+ - **RAG (Retrieval Augmented Generation)**: Combines document-based retrieval with LLMs for accurate responses.
26
+ - **Streamlit Interface**: Provides an intuitive user experience.
27
+
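The retrieve-then-generate flow behind the RAG feature can be sketched as follows. This is a toy illustration, not the code in `app.py`: it swaps the HuggingFace embeddings for a simple bag-of-words cosine similarity and stops at building the prompt instead of calling a Watsonx model. The function names (`tokenize`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks most similar to the query (toy retriever)."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine_similarity(query_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no tokens with the query.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_prompt(query, context_chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a real pipeline the prompt returned by `build_prompt` would be sent to the LLM, and the similarity search would run over dense embedding vectors rather than word counts.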
28
+ ---
29
+
30
+ ## Installation
31
+
32
+ Follow these steps to clone and run the project locally:
33
+
34
+ ### Prerequisites
35
+
36
+ 1. **Python 3.8+** installed on your system.
37
+ 2. `pip` (the Python package manager).
38
+ 3. An IBM Watsonx API key and Project ID.
39
+ 4. Git.
40
+
41
+ ### Clone the Repository
42
+
43
+ ```bash
44
+ git clone https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx.git
45
+ cd Multi-Doc-Retrieval-Watsonx
46
+ ```
47
+
48
+ ### Install Dependencies
49
+
50
+ 1. Create a virtual environment (optional but recommended):
51
+
52
+ ```bash
53
+ python -m venv env
54
+ source env/bin/activate # On Windows: .\env\Scripts\activate
55
+ ```
56
+
57
+ 2. Install required Python packages:
58
+
59
+ ```bash
60
+ pip install -r requirements.txt
61
+ ```
62
+
63
+ ### Set Environment Variables
64
+
65
+ Create a `.env` file in the project directory with the following keys:
66
+
67
+ ```env
68
+ WATSONX_API_KEY=<your_watsonx_api_key>
69
+ WATSONX_PROJECT_ID=<your_watsonx_project_id>
70
+ ```
71
+
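The README does not say how `app.py` reads these values (a library such as python-dotenv is a common choice, but that is an assumption). As a minimal standard-library sketch, the two keys could be parsed and validated like this; `load_env_file` and `require_watsonx_settings` are hypothetical helper names.

```python
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not a KEY=VALUE line; ignore it.
        # Trim whitespace and optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_watsonx_settings(values):
    """Return (api_key, project_id), raising if either key is missing or empty."""
    required = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")
    missing = [name for name in required if not values.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return values["WATSONX_API_KEY"], values["WATSONX_PROJECT_ID"]
```

Failing early with a clear message when a key is absent is usually easier to debug than an authentication error raised later by the model client.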
72
+ ### Run the App
73
+
74
+ 1. Start the Streamlit app by running:
75
+
76
+ ```bash
77
+ streamlit run app.py
78
+ ```
79
+
80
+ 2. Open the URL displayed in your terminal (usually [http://localhost:8501](http://localhost:8501)) to access the app.
81
+
82
+ ---
83
+
84
+ ## How to Use
85
+
86
+ 1. **Upload Documents**: Drag and drop supported files (e.g., PDFs, DOCX, JSON) in the app sidebar.
87
+ 2. **Select Model and Parameters**: Choose a Watsonx model and configure settings like output tokens and decoding methods.
88
+ 3. **Ask Questions**: Enter queries in the chat input to retrieve answers based on the uploaded documents.
89
+
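Handling several upload formats usually comes down to dispatching on the file extension. The sketch below shows that idea for three text-based formats using only the standard library. It is an illustration under assumptions, not the loader used in `app.py`: the names `PARSERS` and `extract_text` are hypothetical, and formats such as PDF, DOCX, or YAML would need third-party parsers that are not shown here.

```python
import csv
import io
import json
from pathlib import Path


def _csv_to_text(data):
    """Flatten CSV rows into one comma-separated line per row."""
    rows = csv.reader(io.StringIO(data))
    return "\n".join(", ".join(row) for row in rows)


def _json_to_text(data):
    """Pretty-print JSON so keys and values become readable text."""
    return json.dumps(json.loads(data), indent=2, sort_keys=True)


# Map lowercase file extensions to text-extraction functions.
PARSERS = {
    ".csv": _csv_to_text,
    ".json": _json_to_text,
    ".txt": lambda data: data,
}


def extract_text(filename, raw_bytes):
    """Decode an uploaded file and convert it to plain text by extension."""
    suffix = Path(filename).suffix.lower()
    parser = PARSERS.get(suffix)
    if parser is None:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return parser(raw_bytes.decode("utf-8"))
```

Supporting another format then means registering one more entry in `PARSERS`, and the rest of the pipeline only ever sees plain text.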
90
+ ---
91
+
92
+ ## Project Structure
93
+
94
+ ```plaintext
95
+ Multi-Doc-Retrieval-Watsonx/
96
+ ├── app.py              # Main application file
97
+ ├── requirements.txt    # Python dependencies
98
+ ├── README.md           # Project documentation
99
+ └── .env                # Environment variables (not included in repo, create manually)
100
+ ```
101
+
102
+ ---
103
+
104
+ ## Dependencies
105
+
106
+ - **Streamlit**: For building the user interface.
107
+ - **LangChain**: For document retrieval and RAG implementation.
108
+ - **HuggingFace Transformers**: For embedding and vector representation.
109
+ - **Watsonx Foundation Models**: For querying and text generation.
110
+ - **Various Python Libraries**: For file handling, including `pandas`, `python-docx`, `python-pptx`, and more.
111
+
112
+ ---
113
+
114
+ ## Contributing
115
+
116
+ We welcome contributions! If you'd like to improve this project:
117
+
118
+ 1. Fork the repository.
119
+ 2. Create a feature branch: `git checkout -b feature-name`.
120
+ 3. Commit your changes: `git commit -m 'Add a new feature'`.
121
+ 4. Push to the branch: `git push origin feature-name`.
122
+ 5. Open a Pull Request.
123
+
124
+ ---
125
+
126
+ ## License
127
+
128
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
129
+
130
+ ---
131
+
132
+