Can we integrate this with LangChain, so that we can feed an entire PDF or large file to the model as context and ask questions to get answers from that document?

#28
by sudsmr - opened

The idea: use LangChain and a vector DB to store the PDF as embeddings, then search over the entire PDF with the model to get the correct answer.

Databricks org

Of course, we are using it with langchain already and it works well. You load this model as a HF Pipeline, and use langchain's HuggingFacePipeline wrapper to plug that in as the llm= arg to a chain. You can customize the prompt too.

Hello, if possible can you point me to a Gradio app where I can upload PDFs and then chat with them? I am building it with LangChain; the backend is ready with dolly-v2, but I am not sure how to integrate the components with Gradio. Please share if you have such an app.

Thanks! πŸ™πŸ™

@srowen @sudsmr

Databricks org

Please see the updated model card for examples of using with LangChain. I just updated the pipeline code today and added new examples of usage.

matthayes changed discussion status to closed

How about multiple large PDF files?

Databricks org

LangChain has some utilities for reading text from PDFs as part of its vector DB support, and for chunking large docs too. This is more of a LangChain question.

Can you give sample Python code that uses LangChain to read a PDF and utilize Dolly 2.0 to answer questions?

Thank you.
