Can we integrate this with LangChain, so that we can feed an entire PDF or large file to the model as context and ask questions to get answers from that document?

#28
by sudsmr - opened

The idea: use LangChain and a vector DB to store the PDF as embeddings, then search over the entire PDF with the model to get the correct answer.

Databricks org

Of course, we are using it with langchain already and it works well. You load this model as a HF Pipeline, and use langchain's HuggingFacePipeline wrapper to plug that in as the llm= arg to a chain. You can customize the prompt too.

Hello, if possible can you point me to a Gradio app where I can upload PDFs and then chat with them? I am building it with LangChain; the backend is ready with dolly-v2, but I am not sure how to integrate the components with Gradio. Please share if you have such an app.

Thanks! πŸ™πŸ™

@srowen @sudsmr

Databricks org

Please see the updated model card for examples of using with LangChain. I just updated the pipeline code today and added new examples of usage.

matthayes changed discussion status to closed

How about multiple large PDF files?

Databricks org

LangChain has some utilities for reading text from PDFs as part of its vector DB support, and for chunking large docs too. This is more of a LangChain question.

Can you give sample Python code that uses LangChain to read a PDF and utilize Dolly 2.0 to answer questions?

Thank you.
