Question Answering model using dolly

#59
by Iamexperimenting - opened

Hi, can anyone help me with building a question answering model using Dolly, or any other open-source LLM?

I have my data in PDF and TXT format (unstructured), and I want to build a conversational question answering model. Could you please point me to any relevant article?

That is, how to build a conversational question answering model over my own data using an open-source LLM.

Databricks org

Sure, this is exactly what langchain is good for. It has question-answering chains that let you build this around a vector DB of text and an LLM.
We have an example that uses Dolly, though you could use any text-gen LLM.
https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot
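The retrieval-augmented pattern the demo uses can be sketched in plain Python. This is a toy illustration only: the `embed`, `cosine`, `retrieve`, and `build_prompt` helpers are hypothetical stand-ins for what langchain, HuggingFaceEmbeddings, and Chroma would actually do in the demo.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; the real demo uses
    # HuggingFaceEmbeddings (sentence-transformers) instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=1):
    # Stand-in for a vector DB similarity search (Chroma in the demo).
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, docs):
    # The retrieved chunks are "stuffed" into the LLM prompt, which is
    # roughly what a langchain question-answering chain does for you.
    context = "\n".join(retrieve(question, docs))
    return f"Use the context to answer.\nContext: {context}\nQuestion: {question}"

docs = ["Dolly is an instruction-tuned LLM from Databricks.",
        "Chroma is an open-source embedding database."]
print(build_prompt("What is Dolly?", docs))
```

The point is that the LLM never stores your documents; at query time the most similar chunks are looked up and prepended to the prompt.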

@srowen thanks for the example. I followed the document and built a question answering model with my data. One doubt: how do I host it as an API? I don't know how. Can you please provide an example?

Databricks org

Many ways, I'm sure. I'm familiar with MLflow, which helps track and then serve models. MLflow natively supports langchain and transformers models/chains, but that support isn't quite enough here, because in this case you also need to bundle a vector DB like Chroma with your model if you're not running it separately. That's fairly straightforward: put the inference code into a class that generates predictions and can load the vector DB from an artifact logged with the model. I don't have a worked example, but it's fairly easy to piece together. Then MLflow can serve it for you. Databricks also has built-in serving built around MLflow, but of course you don't have to use that.
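The custom-model class described above might look roughly like this. It is only a sketch: `QABotModel` and its dictionary "database" are hypothetical, and a real version would subclass `mlflow.pyfunc.PythonModel` and be logged with `mlflow.pyfunc.log_model(...)`, passing the Chroma directory via the `artifacts` argument so `load_context` can reload it at serving time.

```python
# Sketch of the custom-model pattern; names are illustrative.
# A real version subclasses mlflow.pyfunc.PythonModel and reads
# the vector DB path from context.artifacts["db"].
class QABotModel:
    def load_context(self, context):
        # In MLflow this receives the logged artifacts; here we fake it
        # with an in-memory dict standing in for the Chroma index.
        self.db = context["db"]

    def predict(self, context, model_input):
        # model_input: a list of questions. A real predict() would run
        # retrieval plus the LLM chain; here we just look answers up.
        return [self.db.get(q, "I don't know") for q in model_input]

bot = QABotModel()
bot.load_context({"db": {"What is Dolly?": "An instruction-tuned LLM."}})
print(bot.predict(None, ["What is Dolly?"]))
```

Once logged this way, `mlflow models serve` (or Databricks Model Serving) can expose `predict` as an HTTP endpoint.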

thanks very much @srowen. I have a few more general questions to clarify with you.

  1. Currently I use my data (20 files) to create embeddings with HuggingFaceEmbeddings. Even if I have 2 million files, do I need to follow the same steps: 1. create embeddings with HuggingFaceEmbeddings, 2. do a similarity search, and 3. pass the results to the model?
  2. At what stage do I need to retrain the LLM?
  3. Is it possible to retrain the LLM with my own data?
  4. Currently your notebook shows Chroma as the vector DB. If I want to move to production, how do I host it? Where do I store all my embeddings? Do I need to store them in a database, and if so, could you recommend one?
  5. How do I evaluate the Dolly LLM on my data?
  6. Currently I noticed the Dolly model gives one wrong answer on my data. How do I correct the model? With another kind of model, like text classification, I would correct the label and retrain with the corrected label. How do I do that here?

Databricks org

If you're following the langchain question-answering chain pattern, you do need to create a vector DB with your text embedded into it, but langchain does much of the rest. Have you tried the example? Then you can see what your code has to do.

This pattern does not involve fine-tuning an LLM at all.

You can simply bundle the Chroma DB files and deploy them with your model. The index is then fixed and won't change, but that's simple. Otherwise, set up a stand-alone vector DB service (not Chroma) and call that.

Hard to say how to evaluate it. Depends on what you are doing.

You can't correct models directly. You can improve training data, maybe.

thanks for your response again, @srowen .

"Hard to say how to evaluate it. Depends on what you are doing." - I'm creating a question answering model over my unstructured text data using an open-source LLM.

I forgot to ask one question: do I need to use torch.manual_seed(0) for reproducibility (to get the same answer every time)?

Databricks org

I would set do_sample=False for generation instead.
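The reason this works: with `do_sample=False`, generation greedily takes the highest-scoring token at every step, so no randomness is involved and no seed is needed. A toy sketch (the `pick_greedy`/`pick_sampled` helpers and the tiny `logits` table are hypothetical, not the transformers API):

```python
import random

# Toy per-token scores; a real model produces logits over its vocabulary.
logits = {"yes": 2.0, "maybe": 1.5, "no": 0.5}

def pick_greedy(logits):
    # do_sample=False: always take the argmax token -> deterministic,
    # identical output on every run without any seed.
    return max(logits, key=logits.get)

def pick_sampled(logits, rng):
    # do_sample=True: draw a token at random (real generate() samples
    # from the softmax of the logits) -> varies per run unless the
    # random seed is pinned, e.g. via torch.manual_seed(0).
    tokens, weights = zip(*logits.items())
    return rng.choices(tokens, weights=weights)[0]

print(pick_greedy(logits))                      # same token every time
print(pick_sampled(logits, random.Random()))    # can differ per run
```

So greedy decoding gives reproducibility by construction, at the cost of never exploring lower-probability continuations.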

srowen changed discussion status to closed

Follow-up question: as a noob here, to my best understanding I want to "fine-tune", i.e. I want to teach the model a huge database of data and then ask it questions, to which it responds quickly.

Databricks org

I would not fine-tune for knowledge retrieval over a large amount of data. Store the text in a vector store, retrieve relevant text at runtime, and send it to the LLM. This is what the example here does: https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot

Hi Mr. Srowen, I need your help. I have been trying to run the train_dolly file in the Databricks environment. In command number 9, where deepspeed is used, the error displayed is that GPU resources are not available. Can you guide me on how to make use of them on the Databricks platform? Your help would be much appreciated. Below is the error:
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-084abc89-9a56-4220-a941-3e177a718272/bin/deepspeed", line 6, in <module>
    main()
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-084abc89-9a56-4220-a941-3e177a718272/lib/python3.10/site-packages/deepspeed/launcher/runner.py", line 411, in main
    raise RuntimeError("Unable to proceed, no GPU resources available")
RuntimeError: Unable to proceed, no GPU resources available

Databricks org

Looks like you did not run this on a GPU instance.

Looks like you did not run this on a GPU instance.

Really appreciate your reply. Could you further help me by letting me know how I can make use of a GPU instance? (I'm new to the Databricks interface.)

@Iamexperimenting is your code public? I'd really like to take a look; I am trying to build something similar.
