Getting weird (same) response everytime through Mistral7B

#157

by pawankumar-108 - opened Jun 22, 2024

Discussion

pawankumar-108

Jun 22, 2024

•

edited Jun 22, 2024

Hi, I am creating a RAG chatbot using Mistral7B, Haystack nd Chainlit.

I am getting same response everytime I run it in my browser locally.

Is there something I coded wrong there?? Please help, frustrated with it 😩

attaching the app.py file, ss of chainlit UI and cmd status.

would really appreciate if anyone could plz help

app.py Code:

import chainlit as cl
from datasets import load_dataset
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import PromptNode, PromptTemplate, AnswerParser, BM25Retriever
from haystack.pipelines import Pipeline
from haystack.utils import print_answers
import os
from dotenv import load_dotenv

load_dotenv()

from PyPDF2 import PdfReader
import docx
import os
#file_path = "\Ayurvedic Dataset"

def read_pdf(file_path):
with open(file_path, "rb") as file:
pdf_reader = PdfReader(file)
text= ""
for page_num in range(len(pdf_reader.pages)):
text+= pdf_reader.pages[page_num].extract_text()
return text

def read_word(file_path):
doc = docx.Document(file_path)
text = ""
for paragraph in doc.paragraphs:
text+= paragraph.text + "/\n"
return text

def read_txt(file_path):
with open(file_path, "r") as file:
text = file.read()
return text

def read_directory(directory):
combined_text = ""
for file_name in os.listdir(directory):
file_path = os.path.join(directory, file_name)
if file_name.endswith(".pdf"):
combined_text += read_pdf(file_path)
elif file_name.endswith(".docx"):
combined_text += read_word(file_path)
elif file_name.endswith(".txt"):
combined_text += read_txt(file_path)
return combined_text

from datasets import load_dataset

#initializing the retriever
retriever = BM25Retriever(document_store= document_store, top_k= 3)

prompt_template= PromptTemplate(
prompt= """
Answer the provided question based solely on the provided document. If the document does not
have any matched answer. Say them that "No answer found" "
Make sure that the answer is not too long, use a conversational tone as a friend and ask questions as well and based on the
more conversation, answer their questions. And don't include whole document as answer, only specific part need to be showed to the person using you.
Documents: {join(documents)}
Question: {query}
Answer:
""" ,
output_parser = AnswerParser()

)

HF_TOKEN = os.environ.get("HF_TOKEN")

prompt_node = PromptNode(
model_name_or_path = "mistralai/Mistral-7B-Instruct-v0.3",
api_key = HF_TOKEN ,
default_prompt_template = prompt_template ,

)

#creating the pipeline

generative_pipeline = Pipeline()
generative_pipeline.add_node(component= retriever, name="retriever", inputs=["Query"])
generative_pipeline.add_node(component= prompt_node, name= "prompt_node", inputs=["retriever"])

from chainlit import Message
@cl .on_message
async def main(message: cl.Message):
question = message.content # Extract the content of the message
prediction = generative_pipeline.run(query=question)
answers = prediction.get('answers', [])

if answers:
    formatted_response = "\n".join([answer.answer for answer in answers])
else:
    formatted_response = "I'm sorry, I couldn't find an answer to your question."

await cl.Message(
    content=formatted_response
).send()

eddykim129

Jun 25, 2024

Having the exact same issue.... I think there is something wrong with the model.

pandora-s

Mistral AI_ org Jun 25, 2024

Hi there, this is the base model, not an instruct model. You should use https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 instead as it was fine tuned for conversations/instructions.

pandora-s

Mistral AI_ org Jun 25, 2024

I see in your code you do indeed use the proper model thats great, im not used to chainlit but, could you put ur code in a code block to make it easier to read?

pawankumar-108

Jun 25, 2024

I see in your code you do indeed use the proper model thats great, im not used to chainlit but, could you put ur code in a code block to make it easier to read?

Unable to upload it in code format here, so uploaded in this github form: https://github.com/pawan-kumar-108/temp
Plz check.

Thank you.

pandora-s

Mistral AI_ org Jun 25, 2024

I'm checking ur screenshot of your error and seems like the issue is just that the context limit that is set by default seems to be 1k, and your prompt is being cut of- I'm not sure how one would increase it on haystack, maybe an argument of the pipeline?

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment