NealCaren committed on
Commit
6e052a3
1 Parent(s): 0cd5c6b

Revised explanation

  1. app.py +15 -6
app.py CHANGED
@@ -103,16 +103,25 @@ Enter your question in the grey box below and click "Ask the textbook." It can t
  '''

  outro_text = '''
- **Caveats:** Like all apps that employ large language models, this one has the possibility for bias and confabulation.

  **Behind the Scenes**

- This app uses sentence embeddings and a large language model to craft the response. Behind the scenes, it involves the following steps:

- 1. Each page from the textbook (or segment of the article if it's long) is converted into a fixed-length vector representation using OpenAI's text-embedding-ada-002 model. These representations are stored in a dataframe.
- 2. Your question is embedded using the same text-embedding-ada-002 model to convert it into a fixed-length vector representation.
- 3. To find the most relevant articles to your question, cosine similarity is calculated between the query vector and all the page vectors. The pages with the highest cosine similarity are retrieved as the top matches.
- 5. All of the relevant texts (from Step 3), along with the original search query, are passed to OpenAI's ChatGPT 3.5 model with specific instructions to answer the question using the supplied texts.

  '''

  '''

  outro_text = '''
+ **Caveats:** Like all apps that employ large language models, this one has the possibility for bias and confabulation.

  **Behind the Scenes**

+ This app uses a large language model (ChatGPT 3.5) and sentence embeddings (text-embedding-ada-002) to craft the response using what's called a retrieval-augmented generation process. Behind the scenes, it involves the following steps:

+ 1. Each textbook page is broken down into small chunks of text.
+ 2. A machine learning system converts each chunk of text into a mathematical representation called a vector. All these vectors get saved in a table.
+ 3. ChatGPT is used to generate a sample answer to the question.
+ 4. The sample answer is converted into a vector using the same method.
+ 5. The vector for the sample answer is compared to all the vectors for the textbook chunks. The chunks whose vectors are most like the sample answer vector are identified. These chunks are likely to be relevant to answering the question.
+ 6. The original question, along with the relevant textbook chunks that were found, is given to ChatGPT. ChatGPT is instructed to read the textbook chunks first and use them to help answer the question in its own words.
+
+ In summary:
+ - Text is converted to math vectors.
+ - Textbook vectors similar to a sample answer vector are found.
+ - The question and the similar textbook chunks are given to ChatGPT, which answers using those chunks.
+
+ This process allows the AI system to search the textbook, find relevant information, and use it to generate a better answer to the question!

  '''
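
The six steps in the revised explanation can be sketched in a few lines of Python. This is a minimal illustration, not the code in app.py: every function name here is hypothetical, a toy bag-of-words counter stands in for the text-embedding-ada-002 embeddings, and the sample answer is hard-coded where the real app would ask ChatGPT for one. Only the overall flow (chunk, embed, draft a sample answer, compare, build the final prompt) follows the description above.

```python
import math
import re
from collections import Counter


def chunk_text(page: str, max_words: int = 40) -> list[str]:
    """Step 1: break a page into small chunks of at most max_words words."""
    words = page.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str) -> Counter:
    """Steps 2 and 4: ASSUMPTION, a word-count vector replaces the real embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(pages: list[str]) -> list[tuple[str, Counter]]:
    """Steps 1-2: chunk every page and store (chunk, vector) rows in a table."""
    return [(chunk, toy_embed(chunk)) for page in pages for chunk in chunk_text(page)]


def retrieve(sample_answer: str, index: list[tuple[str, Counter]], top_k: int = 2) -> list[str]:
    """Steps 4-5: embed the sample answer and return the most similar chunks."""
    answer_vec = toy_embed(sample_answer)
    scored = [(cosine_similarity(answer_vec, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_prompt(question: str, relevant_chunks: list[str]) -> str:
    """Step 6: combine the original question with the retrieved chunks."""
    excerpts = "\n\n".join(relevant_chunks)
    return (
        "Read the textbook excerpts first, then answer the question in your own words.\n\n"
        f"Excerpts:\n{excerpts}\n\nQuestion: {question}"
    )


# Made-up example pages, not taken from the textbook.
pages = [
    "Social movements are sustained collective efforts to promote or resist change.",
    "Bureaucracy relies on formal hierarchy and written rules.",
]
index = build_index(pages)

question = "What is a social movement?"
# Step 3: in the real app this draft would come from ChatGPT; hard-coded here.
sample_answer = "Social movements are collective efforts to promote or resist social change."

top_chunks = retrieve(sample_answer, index, top_k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

Searching with a drafted answer rather than the raw question means the query vector is built from text that looks like the passages being searched, which is the reason the revised steps embed the sample answer instead of the question itself.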