# Description

In the app, a query is first "related" to chunks of text from the documents in my embedding model pkl file (using cosine similarity). Those results (i.e., the chunks most closely associated with the query) are then fed into a new ChatGPT prompt as "context" along with the query, and ChatGPT uses context + query to respond. The chain is therefore: (a) query my model, and (b) use those results to query ChatGPT. A sketch of this chain appears at the end of this README.

# Embedding

I created the embedding file outside of Hugging Face, though you can also do it there: https://huggingface.co/blog/getting-started-with-embeddings

When embedding, you decide the size of the chunks the text is divided into (e.g., 500, 800, or 1000 token chunks). Chunk size is therefore an important variable: large chunks (e.g., 2000 tokens) pass bigger context blocks to ChatGPT, but also use up more of the limited token budget. For ChatGPT the default length is fixed at 2048 tokens, while the maximum can be set at 4096 tokens. The point being: consider what chunk-sizing strategy you want to use for each project you are working on (a simple chunker is sketched at the end of this README). More on token limits can be found here: https://medium.com/@russkohn/mastering-ai-token-limits-and-memory-ce920630349a

# Templates for App

- https://huggingface.co/spaces/anzorq/chatgpt-demo
- https://blog.devgenius.io/chat-with-document-s-using-openai-chatgpt-api-and-text-embedding-6a0ce3dc8bc8
- https://github.com/hwchase17/chat-langchain-notion/blob/master/ingest_data.py

# On Using LangChain

- https://www.youtube.com/watch?v=2xxziIWmaSA&list=PLqZXAkvF1bPNQER9mLmDbntNfSpzdDIU5&index=3
- https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook.ipynb
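Below is a minimal sketch of the (a) retrieve / (b) respond chain described above. It assumes the pre-1.0 `openai` Python client (`openai.Embedding` / `openai.ChatCompletion`) and a pkl file containing a list of `(chunk_text, embedding_vector)` pairs; the actual layout of the pkl file in this project may differ, and the model names and prompt wording are illustrative, not taken from this repo.

```python
# Sketch of the query chain: embed the query, rank stored chunks by cosine
# similarity, then pass the top matches to ChatGPT as context.
# Assumes `openai.api_key` is already set and that the pkl file holds a
# list of (chunk_text, embedding_vector) tuples -- an assumption, not this
# project's confirmed format.
import pickle

import numpy as np
import openai


def cosine_similarity(a, b) -> float:
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def answer(query: str, pkl_path: str = "embeddings.pkl", top_k: int = 3) -> str:
    with open(pkl_path, "rb") as f:
        chunks = pickle.load(f)  # assumed: [(chunk_text, embedding_vector), ...]

    # (a) query my model: embed the query and rank chunks by cosine similarity
    q_emb = openai.Embedding.create(
        model="text-embedding-ada-002", input=query
    )["data"][0]["embedding"]
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q_emb, c[1]), reverse=True)
    context = "\n\n".join(text for text, _ in ranked[:top_k])

    # (b) feed the retrieved chunks + the query to ChatGPT
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return resp["choices"][0]["message"]["content"]
```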
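And here is one simple way to implement the chunk-sizing step from the Embedding section: a fixed-size token chunker using the `tiktoken` tokenizer. The chunk size and overlap values are illustrative knobs, not values used by this project.

```python
# Fixed-size token chunking with a small overlap between neighboring chunks,
# so that sentences cut at a boundary still appear intact in one chunk.
import tiktoken


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split `text` into chunks of roughly `chunk_size` tokens each."""
    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by recent OpenAI models
    tokens = enc.encode(text)
    step = chunk_size - overlap
    return [
        enc.decode(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), step)
    ]
```

Raising `chunk_size` toward 2000 tokens means fewer, richer chunks per document, but each retrieved chunk then consumes a larger share of the ChatGPT token limit discussed above.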