Asaad Almutareb committed on
Commit 7165161 · 1 Parent(s): 45f1f60

moved s3 variables to .env


added README content
renamed vectorstore dir

Files changed (3)
  1. .gitignore +1 -0
  2. README.md +82 -1
  3. app.py +8 -4
.gitignore CHANGED
@@ -164,5 +164,6 @@ cython_debug/
 *.bin
 *.pickle
 chroma_db/*
+vectorstore/*
 bin
 obj
README.md CHANGED
@@ -1 +1,82 @@
-# docu-qachat-demo
+# docu-qachat-demo
+---
+title: Docs Qachat
+emoji: 🚀
+colorFrom: gray
+colorTo: gray
+sdk: gradio
+sdk_version: 4.2.0
+app_file: app.py
+pinned: false
+---
+
+# Docs QAchat 🚀
+
+## Overview
+Docs QAchat is a documentation AI helper that demonstrates how a fine-tuned 7B model can assist users with software documentation. The application combines Retrieval-Augmented Generation (RAG), LangChain, a Gradio UI, Chroma DB, and FAISS: it retrieves the relevant documentation pages and maintains conversational flow, helping users navigate and use software tools efficiently.
+
+## Key Features
+- **AI-Powered Documentation Retrieval:** Uses fine-tuned 7B models for precise, context-aware responses.
+- **Rich User Interface:** Provides a user-friendly interface built with Gradio.
+- **Advanced Language Understanding:** Employs LangChain for the RAG setup and natural language processing.
+- **Efficient Data Handling:** Leverages Chroma DB and FAISS for optimized vector storage and retrieval.
+- **Retrieval Chain with Prompt Tuning:** Includes a retrieval chain with a prompt template for prompt tuning.
+- **Conversation Memory:** Incorporates BufferMemory for short-term conversation memory, enhancing conversational flow.
+
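The retrieval chain and conversation memory listed above are the core of the app. Below is a minimal sketch of how such a chain can be wired up, assuming the late-2023 LangChain API that `app.py` uses; the chain class, prompt text, `k` value, and example question are illustrative choices, not the project's actual settings.

```python
# Minimal sketch (not the project's actual code): a conversational RAG chain
# over the Chroma vector store. Requires HUGGINGFACEHUB_API_TOKEN in the environment.
from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain.llms import HuggingFaceHub
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma

llm = HuggingFaceHub(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    model_kwargs={"temperature": 0.1, "max_new_tokens": 1024},
)
embeddings = HuggingFaceHubEmbeddings()
db = Chroma(persist_directory="./vectorstore", embedding_function=embeddings)

# Illustrative prompt template for the answer-generation step ("prompt tuning").
qa_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using only the documentation excerpts below.\n\n"
        "{context}\n\nQuestion: {question}\nAnswer:"
    ),
)

# Short-term conversation memory so follow-up questions keep their context.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=db.as_retriever(search_kwargs={"k": 3}),
    memory=memory,
    combine_docs_chain_kwargs={"prompt": qa_prompt},
)

print(qa_chain({"question": "How do I configure the S3 variables?"})["answer"])
```

`ConversationalRetrievalChain` condenses the chat history plus the new question into a standalone query, retrieves matching chunks from the vector store, and answers with the template above; whether the Space uses this exact chain class is an assumption, but the prompt-template and BufferMemory pieces mirror the features listed.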
+## Models Used
+This setup is tested with the following models:
+- `mistralai/Mistral-7B-v0.1`
+- `mistralai/Mistral-7B-Instruct-v0.1`
+- `HuggingFaceH4/zephyr-7b-beta`
+- `HuggingFaceH4/zephyr-7b-alpha`
+- `tiiuae/falcon-7b-instruct`
+- `microsoft/Orca-2-7b`
+- `teknium/OpenHermes-2.5-Mistral-7B`
+
+## Prerequisites
+- Python 3.8 or later
+- [Additional prerequisites as needed]
+
+## Installation
+1. Clone the repository:
+```bash
+git clone https://github.com/yourusername/Docs-QAchat.git
+```
+2. Navigate to the project directory:
+```bash
+cd Docs-QAchat
+```
+3. Install the required packages:
+```bash
+pip install -r requirements.txt
+```
+
+## Configuration
+1. Create a `.env` file in the project root.
+2. Add the following environment variables to the `.env` file:
+```
+HUGGINGFACEHUB_API_TOKEN=""
+AWS_S3_LOCATION=""
+AWS_S3_FILE=""
+VS_DESTINATION=""
+```
+
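These four variables are exactly the ones `app.py` reads with `os.getenv` after this commit (see the diff below). A minimal sketch of that pattern, assuming `python-dotenv` is installed; the fail-fast check is an illustrative addition, not part of the app:

```python
# Sketch only: load the .env file and fail early if a variable is missing.
import os
from dotenv import load_dotenv

load_dotenv(".env")

required = ["HUGGINGFACEHUB_API_TOKEN", "AWS_S3_LOCATION", "AWS_S3_FILE", "VS_DESTINATION"]
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing .env variables: {', '.join(missing)}")

AWS_S3_LOCATION = os.getenv("AWS_S3_LOCATION")  # S3 bucket holding the vector store
AWS_S3_FILE = os.getenv("AWS_S3_FILE")          # object key of the store dump
VS_DESTINATION = os.getenv("VS_DESTINATION")    # local path to download it to
```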
+## Usage
+Start the application by running:
+```bash
+python app.py
+```
+[Include additional usage instructions and examples]
+
+## Contributing
+Contributions to Docs QAchat are welcome. [Include contribution guidelines]
+
+## Support
+For support, contact [Support Contact Information].
+
+## Authors and Acknowledgement
+- [Name]
+- Acknowledgements to the contributors of the models and technologies used.
+
+## License
+This project is licensed under the [License] - see the LICENSE file for details.
app.py CHANGED
@@ -7,6 +7,7 @@ import boto3
 from botocore import UNSIGNED
 from botocore.client import Config
 # access .env file
+import os
 from dotenv import load_dotenv
 #from bs4 import BeautifulSoup
 # HF libraries
@@ -24,9 +25,12 @@ from langchain.memory import ConversationBufferMemory
 #import logging
 import zipfile
 
-# load HF Token
+# load .env variables
 config = load_dotenv(".env")
-
+HUGGINGFACEHUB_API_TOKEN=os.getenv('HUGGINGFACEHUB_API_TOKEN')
+AWS_S3_LOCATION=os.getenv('AWS_S3_LOCATION')
+AWS_S3_FILE=os.getenv('AWS_S3_FILE')
+VS_DESTINATION=os.getenv('VS_DESTINATION')
 
 model_id = HuggingFaceHub(repo_id="HuggingFaceH4/zephyr-7b-beta", model_kwargs={
     "temperature":0.1,
@@ -43,8 +47,8 @@ embeddings = HuggingFaceHubEmbeddings(repo_id=model_name)
 s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
 
 ## Chroma DB
-s3.download_file('rad-rag-demos', 'vectorstores/chroma.sqlite3', './chroma_db/chroma.sqlite3')
-db = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
+s3.download_file(AWS_S3_LOCATION, AWS_S3_FILE, VS_DESTINATION)
+db = Chroma(persist_directory="./vectorstore", embedding_function=embeddings)
 db.get()
 
 ## FAISS DB
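Pulled out of the diff, the refactored start-up path in `app.py` looks roughly like the sketch below. The directory-creation step and the final sanity-check print are illustrative additions; it also assumes the S3 bucket is publicly readable (which is why the client is unsigned) and that `VS_DESTINATION` points at a file inside `./vectorstore`.

```python
# Sketch of the new app.py start-up flow: config from .env, anonymous S3
# download of the vector store, then opening it with Chroma.
import os

import boto3
from botocore import UNSIGNED
from botocore.client import Config
from dotenv import load_dotenv
from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain.vectorstores import Chroma

load_dotenv(".env")
AWS_S3_LOCATION = os.getenv("AWS_S3_LOCATION")
AWS_S3_FILE = os.getenv("AWS_S3_FILE")
VS_DESTINATION = os.getenv("VS_DESTINATION")  # e.g. ./vectorstore/chroma.sqlite3 (assumed)

# Illustrative: make sure the local target directory exists before the download.
os.makedirs(os.path.dirname(VS_DESTINATION) or ".", exist_ok=True)

# Anonymous (unsigned) client: no AWS credentials needed for a public bucket.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
s3.download_file(AWS_S3_LOCATION, AWS_S3_FILE, VS_DESTINATION)

# Open the downloaded store; the embedding model must match the one used to build it.
embeddings = HuggingFaceHubEmbeddings()
db = Chroma(persist_directory="./vectorstore", embedding_function=embeddings)
print(len(db.get()["ids"]), "documents loaded")  # quick sanity check
```

Keeping the bucket name, object key, and local destination in `.env` rather than hard-coded, as before this commit, lets the Space point at a different vector store without a code change.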