Sean-Case committed on
Commit
2e536f9
1 Parent(s): 84b25ff

Added working like buttons. Added model choice. Modified requirements.txt

Generation speed GPU test.txt ADDED
@@ -0,0 +1,51 @@
+ With 5 gpu layers, batch size 8
+
+ Num of generated tokens: 113
+ Time for complete generation: 115.42684650421143s
+ Tokens per second: 0.9789750255013432
+ Time per token: 1021.4765177363843ms
+
+ With 5 gpu layers, batch size 512
+
+ Num of generated tokens: 102
+ Time for complete generation: 40.369266986846924s
+ Tokens per second: 2.5266745624396285
+ Time per token: 395.77712732202866ms
+
+ With 6 gpu layers -
+
+ Num of generated tokens: 113
+ Time for complete generation: 46.37785983085632s
+ Tokens per second: 2.4365074285902764
+ Time per token: 410.42353832616215ms
+
+ With 6 gpu layers, batch size 1024 -
+ Five pillars Q:
+ Num of generated tokens: 102
+ Time for complete generation: 41.85241961479187s
+ Tokens per second: 2.4371350793766346
+ Time per token: 410.31783936070457ms
+
+ With 8 threads
+ Num of generated tokens: 102
+ Time for complete generation: 40.64410996437073s
+ Tokens per second: 2.5095887224351774
+ Time per token: 398.4716663173601ms
+
+ Vision statement Q:
+ Num of generated tokens: 84
+ Time for complete generation: 35.57932233810425s
+ Tokens per second: 2.360921863597128
+ Time per token: 423.5633611679077ms
+
+ Commitments Q:
+ Num of generated tokens: 50
+ Time for complete generation: 23.73319172859192s
+ Tokens per second: 2.106754142965266
+ Time per token: 474.6638345718384ms
+
+ Outcomes Q:
+ Num of generated tokens: 167
+ Time for complete generation: 52.302518367767334s
+ Tokens per second: 3.1929628861412094
+ Time per token: 313.1887327411217ms
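For reference, the throughput figures in this log follow directly from the token count and the wall-clock time of the generation loop. A minimal sketch of the arithmetic, using the "5 gpu layers, batch size 512" run as a worked example (variable names are illustrative, not taken from the app):

    # Deriving tokens/second and ms/token from the logged values
    num_tokens = 102
    time_generate = 40.369266986846924    # seconds for the complete generation

    tokens_per_second = num_tokens / time_generate        # ~2.53 tokens per second
    ms_per_token = (time_generate / num_tokens) * 1000    # ~395.8 ms per token

    print(f'Tokens per second: {tokens_per_second}')
    print(f'Time per token: {ms_per_token}ms')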
Link to images.txt CHANGED
@@ -1,4 +1,4 @@
- Robot emoji: https://upload.wikimedia.org/wikipedia/commons/thumb/5/50/Fluent_Emoji_high_contrast_1f916.svg/32px-Fluent_Emoji_high_contrast_1f916.svg.png
+ Robot emoji: https://commons.wikimedia.org/wiki/File:Fluent_Emoji_high_contrast_1f916.svg

  Bing smile emoji: https://www.bing.com/images/create/a-black-and-white-emoji-with-a-simple-smile2c-black/6523d2c320df409581e85bec80ef3ba8?id=KTdVbixG8oRqR9BzF6AblQ%3d%3d&view=detailv2&idpp=genimg&idpclose=1&FORM=SYDBIC

app.py CHANGED
@@ -5,7 +5,13 @@ import os
5
  from typing import TypeVar
6
  from langchain.embeddings import HuggingFaceEmbeddings, HuggingFaceInstructEmbeddings
7
  from langchain.vectorstores import FAISS
 
 
 
 
8
 
 
 
9
 
10
  #PandasDataFrame: type[pd.core.frame.DataFrame]
11
  PandasDataFrame = TypeVar('pd.core.frame.DataFrame')
@@ -16,7 +22,8 @@ PandasDataFrame = TypeVar('pd.core.frame.DataFrame')
16
  #from chatfuncs.chatfuncs import *
17
  import chatfuncs.ingest as ing
18
 
19
- ## Load preset embeddings and vectorstore
 
20
 
21
  embeddings_name = "thenlper/gte-base"
22
 
@@ -58,6 +65,55 @@ import chatfuncs.chatfuncs as chatf
58
  chatf.embeddings = load_embeddings(embeddings_name)
59
  chatf.vectorstore = get_faiss_store(faiss_vstore_folder="faiss_embedding",embeddings=globals()["embeddings"])
60
61
  def docs_to_faiss_save(docs_out:PandasDataFrame, embeddings=embeddings):
62
 
63
  print(f"> Total split documents: {len(docs_out)}")
@@ -75,14 +131,6 @@ def docs_to_faiss_save(docs_out:PandasDataFrame, embeddings=embeddings):
75
 
76
  # Gradio chat
77
 
78
- import gradio as gr
79
-
80
- def vote(data: gr.LikeData):
81
- if data.liked:
82
- print("You upvoted this response: " + data.value)
83
- else:
84
- print("You downvoted this response: " + data.value)
85
-
86
  block = gr.Blocks(theme = gr.themes.Base())#css=".gradio-container {background-color: black}")
87
 
88
  with block:
@@ -90,22 +138,26 @@ with block:
90
  ingest_metadata = gr.State()
91
  ingest_docs = gr.State()
92
 
 
93
  embeddings_state = gr.State(globals()["embeddings"])
94
  vectorstore_state = gr.State(globals()["vectorstore"])
95
 
 
 
 
96
  chat_history_state = gr.State()
97
  instruction_prompt_out = gr.State()
98
 
99
  gr.Markdown("<h1><center>Lightweight PDF / web page QA bot</center></h1>")
100
 
101
- gr.Markdown("Chat with a document (alpha). This is a small model, that can only answer specific questions that are answered in the text. It cannot give overall impressions of, or summarise the document. By default the Lambeth Borough Plan '[Lambeth 2030 : Our Future, Our Lambeth](https://www.lambeth.gov.uk/better-fairer-lambeth/projects/lambeth-2030-our-future-our-lambeth)' is loaded. If you want to talk about another document or web page, please select from the second tab. If switching topic, please click the 'Clear chat' button.\n\nWarnings: This is a public app. Please ensure that the document you upload is not sensitive is any way as other users may see it! Also, please note that LLM chatbots may give incomplete or incorrect information, so please use with care.")
102
 
103
  current_source = gr.Textbox(label="Current data source that is loaded into the app", value="Lambeth_2030-Our_Future_Our_Lambeth.pdf")
104
 
105
  with gr.Tab("Chatbot"):
106
 
107
  with gr.Row():
108
- chat_height = 550
109
  chatbot = gr.Chatbot(height=chat_height, avatar_images=('user.jfif', 'bot.jpg'),bubble_full_width = False)
110
  sources = gr.HTML(value = "Source paragraphs where I looked for answers will appear here", height=chat_height)
111
 
@@ -143,12 +195,17 @@ with block:
143
 
144
  ingest_embed_out = gr.Textbox(label="File/webpage preparation progress")
145
 
 
 
 
146
  gr.HTML(
147
- "<center>Powered by Orca Mini and Langchain</a></center>"
148
  )
149
 
150
  examples_set.change(fn=chatf.update_message, inputs=[examples_set], outputs=[message])
151
 
 
 
152
  # Load in a pdf
153
  load_pdf_click = load_pdf.click(ing.parse_file, inputs=[in_pdf], outputs=[ingest_text, current_source]).\
154
  then(ing.text_to_docs, inputs=[ingest_text], outputs=[ingest_docs]).\
@@ -164,25 +221,25 @@ with block:
164
  # Load in a webpage
165
 
166
  # Click/enter to send message action
167
- response_click = submit.click(chatf.get_history_sources_final_input_prompt, inputs=[message, chat_history_state, current_topic, vectorstore_state, embeddings_state], outputs=[chat_history_state, sources, instruction_prompt_out], queue=False, api_name="retrieval").\
168
  then(chatf.turn_off_interactivity, inputs=[message, chatbot], outputs=[message, chatbot], queue=False).\
169
- then(chatf.produce_streaming_answer_chatbot_ctrans, inputs=[chatbot, instruction_prompt_out], outputs=chatbot)
170
  response_click.then(chatf.highlight_found_text, [chatbot, sources], [sources]).\
171
  then(chatf.add_inputs_answer_to_history,[message, chatbot, current_topic], [chat_history_state, current_topic]).\
172
- then(lambda: gr.update(interactive=True), None, [message], queue=False)
173
 
174
- response_enter = message.submit(chatf.get_history_sources_final_input_prompt, inputs=[message, chat_history_state, current_topic, vectorstore_state, embeddings_state], outputs=[chat_history_state, sources, instruction_prompt_out], queue=False).\
175
  then(chatf.turn_off_interactivity, inputs=[message, chatbot], outputs=[message, chatbot], queue=False).\
176
- then(chatf.produce_streaming_answer_chatbot_ctrans, [chatbot, instruction_prompt_out], chatbot)
177
  response_enter.then(chatf.highlight_found_text, [chatbot, sources], [sources]).\
178
  then(chatf.add_inputs_answer_to_history,[message, chatbot, current_topic], [chat_history_state, current_topic]).\
179
- then(lambda: gr.update(interactive=True), None, [message], queue=False)
180
 
181
  # Clear box
182
  clear.click(chatf.clear_chat, inputs=[chat_history_state, sources, message, current_topic], outputs=[chat_history_state, sources, message, current_topic])
183
  clear.click(lambda: None, None, chatbot, queue=False)
184
 
185
- chatbot.like(vote, None, None)
186
 
187
  block.queue(concurrency_count=1).launch(debug=True)
188
  # -
 
5
  from typing import TypeVar
6
  from langchain.embeddings import HuggingFaceEmbeddings, HuggingFaceInstructEmbeddings
7
  from langchain.vectorstores import FAISS
8
+ import gradio as gr
9
+
10
+ from transformers import AutoTokenizer#, pipeline, TextIteratorStreamer
11
+ from dataclasses import asdict, dataclass
12
 
13
+ # Alternative model sources
14
+ from ctransformers import AutoModelForCausalLM#, AutoTokenizer
15
 
16
  #PandasDataFrame: type[pd.core.frame.DataFrame]
17
  PandasDataFrame = TypeVar('pd.core.frame.DataFrame')
 
22
  #from chatfuncs.chatfuncs import *
23
  import chatfuncs.ingest as ing
24
 
25
+
26
+ ## Load preset embeddings, vectorstore, and model
27
 
28
  embeddings_name = "thenlper/gte-base"
29
 
 
65
  chatf.embeddings = load_embeddings(embeddings_name)
66
  chatf.vectorstore = get_faiss_store(faiss_vstore_folder="faiss_embedding",embeddings=globals()["embeddings"])
67
 
68
+ model_type = "Flan Alpaca"
69
+
70
+
71
+ def load_model(model_type, CtransInitConfig_gpu=chatf.CtransInitConfig_gpu, CtransInitConfig_cpu=chatf.CtransInitConfig_cpu, torch_device=chatf.torch_device):
72
+ print("Loading model")
73
+ if model_type == "Orca Mini":
74
+ try:
75
+ model = AutoModelForCausalLM.from_pretrained('juanjgit/orca_mini_3B-GGUF', model_type='llama', model_file='orca-mini-3b.q4_0.gguf', **asdict(CtransInitConfig_gpu()))
76
+ except Exception: # Fall back to the CPU config if GPU initialisation fails
77
+ model = AutoModelForCausalLM.from_pretrained('juanjgit/orca_mini_3B-GGUF', model_type='llama', model_file='orca-mini-3b.q4_0.gguf', **asdict(CtransInitConfig_cpu()))
78
+
79
+ tokenizer = []
80
+
81
+ if model_type == "Flan Alpaca":
82
+ # Huggingface chat model
83
+ hf_checkpoint = 'declare-lab/flan-alpaca-large'
84
+
85
+ def create_hf_model(model_name):
86
+
87
+ from transformers import AutoModelForSeq2SeqLM, AutoModelForCausalLM
88
+
89
+ # model_id = model_name
90
+
91
+ if torch_device == "cuda":
92
+ if "flan" in model_name:
93
+ model = AutoModelForSeq2SeqLM.from_pretrained(model_name, load_in_8bit=True, device_map="auto")
94
+ else:
95
+ model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True, device_map="auto")
96
+ else:
97
+ if "flan" in model_name:
98
+ model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
99
+ else:
100
+ model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
101
+
102
+ tokenizer = AutoTokenizer.from_pretrained(model_name, model_max_length = chatf.context_length)
103
+
104
+ return model, tokenizer, model_type
105
+
106
+ model, tokenizer, model_type = create_hf_model(model_name = hf_checkpoint)
107
+
108
+ chatf.model = model
109
+ chatf.tokenizer = tokenizer
110
+ chatf.model_type = model_type
111
+
112
+ print("Finished loading model: ", model_type)
113
+ return model_type
114
+
115
+ load_model(model_type, chatf.CtransInitConfig_gpu, chatf.CtransInitConfig_cpu, chatf.torch_device)
116
+
117
  def docs_to_faiss_save(docs_out:PandasDataFrame, embeddings=embeddings):
118
 
119
  print(f"> Total split documents: {len(docs_out)}")
 
131
 
132
  # Gradio chat
133
 
 
 
 
 
 
 
 
 
134
  block = gr.Blocks(theme = gr.themes.Base())#css=".gradio-container {background-color: black}")
135
 
136
  with block:
 
138
  ingest_metadata = gr.State()
139
  ingest_docs = gr.State()
140
 
141
+ model_type_state = gr.State(model_type)
142
  embeddings_state = gr.State(globals()["embeddings"])
143
  vectorstore_state = gr.State(globals()["vectorstore"])
144
 
145
+ model_state = gr.State() # chatf.model (gives error)
146
+ tokenizer_state = gr.State() # chatf.tokenizer (gives error)
147
+
148
  chat_history_state = gr.State()
149
  instruction_prompt_out = gr.State()
150
 
151
  gr.Markdown("<h1><center>Lightweight PDF / web page QA bot</center></h1>")
152
 
153
+ gr.Markdown("Chat with PDF or web page documents. The default is a small model (Flan Alpaca), that can only answer specific questions that are answered in the text. It cannot give overall impressions of, or summarise the document. The alternative (Orca Mini), can reason a little better, but is much slower (See advanced tab).\n\nBy default the Lambeth Borough Plan '[Lambeth 2030 : Our Future, Our Lambeth](https://www.lambeth.gov.uk/better-fairer-lambeth/projects/lambeth-2030-our-future-our-lambeth)' is loaded. If you want to talk about another document or web page, please select from the second tab. If switching topic, please click the 'Clear chat' button.\n\nCaution: This is a public app. Likes and dislike responses will be saved to disk to improve the model. Please ensure that the document you upload is not sensitive is any way as other users may see it! Also, please note that LLM chatbots may give incomplete or incorrect information, so please use with care.")
154
 
155
  current_source = gr.Textbox(label="Current data source that is loaded into the app", value="Lambeth_2030-Our_Future_Our_Lambeth.pdf")
156
 
157
  with gr.Tab("Chatbot"):
158
 
159
  with gr.Row():
160
+ chat_height = 500
161
  chatbot = gr.Chatbot(height=chat_height, avatar_images=('user.jfif', 'bot.jpg'),bubble_full_width = False)
162
  sources = gr.HTML(value = "Source paragraphs where I looked for answers will appear here", height=chat_height)
163
 
 
195
 
196
  ingest_embed_out = gr.Textbox(label="File/webpage preparation progress")
197
 
198
+ with gr.Tab("Advanced features"):
199
+ model_choice = gr.Radio(label="Choose a chat model", value="Flan Alpaca", choices = ["Flan Alpaca", "Orca Mini"])
200
+
201
  gr.HTML(
202
+ "<center>This app is based on the models Flan Alpaca and Orca Mini. It powered by Gradio, Transformers, Ctransformers, and Langchain.</a></center>"
203
  )
204
 
205
  examples_set.change(fn=chatf.update_message, inputs=[examples_set], outputs=[message])
206
 
207
+ model_choice.change(fn=load_model, inputs=[model_choice], outputs = [model_type_state])
208
+
209
  # Load in a pdf
210
  load_pdf_click = load_pdf.click(ing.parse_file, inputs=[in_pdf], outputs=[ingest_text, current_source]).\
211
  then(ing.text_to_docs, inputs=[ingest_text], outputs=[ingest_docs]).\
 
221
  # Load in a webpage
222
 
223
  # Click/enter to send message action
224
+ response_click = submit.click(chatf.create_full_prompt, inputs=[message, chat_history_state, current_topic, vectorstore_state, embeddings_state, model_type_state], outputs=[chat_history_state, sources, instruction_prompt_out], queue=False, api_name="retrieval").\
225
  then(chatf.turn_off_interactivity, inputs=[message, chatbot], outputs=[message, chatbot], queue=False).\
226
+ then(chatf.produce_streaming_answer_chatbot, inputs=[chatbot, instruction_prompt_out, model_type_state], outputs=chatbot)
227
  response_click.then(chatf.highlight_found_text, [chatbot, sources], [sources]).\
228
  then(chatf.add_inputs_answer_to_history,[message, chatbot, current_topic], [chat_history_state, current_topic]).\
229
+ then(lambda: chatf.restore_interactivity(), None, [message], queue=False)
230
 
231
+ response_enter = message.submit(chatf.create_full_prompt, inputs=[message, chat_history_state, current_topic, vectorstore_state, embeddings_state, model_type_state], outputs=[chat_history_state, sources, instruction_prompt_out], queue=False).\
232
  then(chatf.turn_off_interactivity, inputs=[message, chatbot], outputs=[message, chatbot], queue=False).\
233
+ then(chatf.produce_streaming_answer_chatbot, [chatbot, instruction_prompt_out, model_type_state], chatbot)
234
  response_enter.then(chatf.highlight_found_text, [chatbot, sources], [sources]).\
235
  then(chatf.add_inputs_answer_to_history,[message, chatbot, current_topic], [chat_history_state, current_topic]).\
236
+ then(lambda: chatf.restore_interactivity(), None, [message], queue=False)
237
 
238
  # Clear box
239
  clear.click(chatf.clear_chat, inputs=[chat_history_state, sources, message, current_topic], outputs=[chat_history_state, sources, message, current_topic])
240
  clear.click(lambda: None, None, chatbot, queue=False)
241
 
242
+ chatbot.like(chatf.vote, [chat_history_state, instruction_prompt_out, model_type_state], None)
243
 
244
  block.queue(concurrency_count=1).launch(debug=True)
245
  # -
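Taken together, the app.py changes above wire the new model choice through Gradio state: the Radio component's change event calls load_model, and the selected model type is stored in a gr.State that the retrieval and generation handlers receive as an extra input. A minimal, self-contained sketch of that pattern, assuming the Gradio 3.x API pinned in requirements.txt (names and the print statement are illustrative, not the app's full code):

    import gradio as gr

    def load_model(model_type):
        # In the real app this loads Flan Alpaca (transformers) or Orca Mini
        # (ctransformers) and stores the model in module-level state.
        print("Loading model:", model_type)
        return model_type

    with gr.Blocks() as demo:
        model_type_state = gr.State("Flan Alpaca")
        model_choice = gr.Radio(label="Choose a chat model", value="Flan Alpaca",
                                choices=["Flan Alpaca", "Orca Mini"])
        # Reload the model whenever the selection changes and keep the chosen
        # type in session state for downstream event handlers to read.
        model_choice.change(fn=load_model, inputs=[model_choice], outputs=[model_type_state])

    demo.queue(concurrency_count=1).launch()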
chatfuncs/chatfuncs.py CHANGED
@@ -1,13 +1,13 @@
1
  import re
2
  import datetime
3
  from typing import TypeVar, Dict, List, Tuple
 
4
  from itertools import compress
5
  import pandas as pd
6
  import numpy as np
7
 
8
  # Model packages
9
  import torch
10
- torch.cuda.empty_cache()
11
  from threading import Thread
12
  from transformers import AutoTokenizer, pipeline, TextIteratorStreamer
13
 
@@ -16,7 +16,6 @@ from ctransformers import AutoModelForCausalLM#, AutoTokenizer
16
  from dataclasses import asdict, dataclass
17
 
18
  # Langchain functions
19
- from langchain import PromptTemplate
20
  from langchain.prompts import PromptTemplate
21
  from langchain.vectorstores import FAISS
22
  from langchain.retrievers import SVMRetriever
@@ -41,26 +40,46 @@ from gensim.similarities import SparseMatrixSimilarity
41
 
42
  import gradio as gr
43
 
44
- if torch.cuda.is_available():
45
- torch_device = "cuda"
46
- gpu_layers = 5
47
- else: torch_device = "cpu"
48
-
49
- print("Running on device:", torch_device)
50
- threads = 8#torch.get_num_threads()
51
- print("CPU threads:", threads)
52
 
53
  PandasDataFrame = TypeVar('pd.core.frame.DataFrame')
54
 
55
  embeddings = None # global variable setup
56
  vectorstore = None # global variable setup
 
57
 
58
  max_memory_length = 0 # How long should the memory of the conversation last?
59
 
60
  full_text = "" # Define dummy source text (full text) just to enable highlight function to load
61
 
62
- ctrans_llm = [] # Define empty list to hold CTrans LLMs for functions to run
63
 
 
 
 
 
 
64
  temperature: float = 0.1
65
  top_k: int = 3
66
  top_p: float = 1
@@ -68,24 +87,26 @@ repetition_penalty: float = 1.05
68
  flan_alpaca_repetition_penalty: float = 1.3
69
  last_n_tokens: int = 64
70
  max_new_tokens: int = 125
71
- #seed: int = 42
72
  reset: bool = False
73
  stream: bool = True
74
  threads: int = threads
75
- batch_size:int = 512
76
  context_length:int = 4096
77
- gpu_layers:int = 0#5#gpu_layers For serving on Huggingface set to 0 as using free CPU instance
78
  sample = True
79
 
 
 
 
80
  @dataclass
81
- class GenerationConfig:
82
  temperature: float = temperature
83
  top_k: int = top_k
84
  top_p: float = top_p
85
  repetition_penalty: float = repetition_penalty
86
  last_n_tokens: int = last_n_tokens
87
  max_new_tokens: int = max_new_tokens
88
- #seed: int = 42
89
  reset: bool = reset
90
  stream: bool = stream
91
  threads: int = threads
@@ -94,60 +115,33 @@ class GenerationConfig:
94
  gpu_layers:int = gpu_layers
95
  #stop: list[str] = field(default_factory=lambda: [stop_string])
96
 
97
 
98
- ## Highlight text constants
99
- hlt_chunk_size = 15
100
- hlt_strat = [" ", ".", "!", "?", ":", "\n\n", "\n", ","]
101
- hlt_overlap = 0
102
-
103
- ## Initialise NER model ##
104
- ner_model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-mbert-base-multinerd")
105
-
106
- ## Initialise keyword model ##
107
- # Used to pull out keywords from chat history to add to user queries behind the scenes
108
- kw_model = pipeline("feature-extraction", model="sentence-transformers/all-MiniLM-L6-v2")
109
-
110
- ## Set model type ##
111
- model_type = "ctrans"
112
-
113
- ## Chat models ##
114
-
115
- if model_type == "ctrans":
116
- ctrans_llm = AutoModelForCausalLM.from_pretrained('juanjgit/orca_mini_3B-GGUF', model_type='llama', model_file='orca-mini-3b.q4_0.gguf', **asdict(GenerationConfig()))
117
- #ctrans_llm = AutoModelForCausalLM.from_pretrained('TheBloke/Mistral-7B-OpenOrca-GGUF', model_type='mistral', model_file='mistral-7b-openorca.Q4_K_M.gguf', **asdict(GenerationConfig()))
118
- #ctrans_llm = AutoModelForCausalLM.from_pretrained('TheBloke/Mistral-7B-OpenOrca-GGUF', model_type='mistral', model_file='mistral-7b-openorca.Q2_K.gguf', **asdict(GenerationConfig()))
119
-
120
- if model_type == "hf":
121
- # Huggingface chat model
122
- #hf_checkpoint = 'jphme/phi-1_5_Wizard_Vicuna_uncensored'
123
- hf_checkpoint = 'declare-lab/flan-alpaca-large'
124
-
125
- def create_hf_model(model_name):
126
-
127
- from transformers import AutoModelForSeq2SeqLM, AutoModelForCausalLM
128
-
129
- # model_id = model_name
130
-
131
- if torch_device == "cuda":
132
- if "flan" in model_name:
133
- model = AutoModelForSeq2SeqLM.from_pretrained(model_name, load_in_8bit=True, device_map="auto")
134
- elif "mpt" in model_name:
135
- model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True, device_map="auto", trust_remote_code=True)
136
- else:
137
- model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True, device_map="auto")
138
- else:
139
- if "flan" in model_name:
140
- model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
141
- elif "mpt" in model_name:
142
- model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
143
- else:
144
- model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
145
-
146
- tokenizer = AutoTokenizer.from_pretrained(model_name, model_max_length = 2048)
147
-
148
- return model, tokenizer, torch_device
149
-
150
- model, tokenizer, torch_device = create_hf_model(model_name = hf_checkpoint)
151
 
152
  # Vectorstore funcs
153
 
@@ -179,9 +173,9 @@ def docs_to_faiss_save(docs_out:PandasDataFrame, embeddings=embeddings):
179
 
180
  return out_message
181
 
182
- # # Prompt functions
183
 
184
- def create_prompt_templates():
185
 
186
  #EXAMPLE_PROMPT = PromptTemplate(
187
  # template="\nCONTENT:\n\n{page_content}\n\nSOURCE: {source}\n\n",
@@ -193,7 +187,6 @@ def create_prompt_templates():
193
  input_variables=["page_content"]
194
  )
195
 
196
-
197
  # The main prompt:
198
 
199
  instruction_prompt_template_alpaca_quote = """### Instruction:
@@ -205,31 +198,168 @@ def create_prompt_templates():
205
 
206
  Response:"""
207
 
 
 
 
 
 
 
 
 
208
  instruction_prompt_template_orca = """
209
  ### System:
210
  You are an AI assistant that follows instruction extremely well. Help as much as you can.
211
  ### User:
212
- Answer the QUESTION using information from the following CONTENT.
213
  CONTENT: {summaries}
214
  QUESTION: {question}
215
 
216
  ### Response:"""
217
 
218
-
219
  instruction_prompt_mistral_orca = """<|im_start|>system\n
220
- You are an AI assistant that follows instruction extremely well. Help as much as you can.
221
- <|im_start|>user\n
222
- Answer the QUESTION using information from the following CONTENT.
223
- CONTENT: {summaries}
224
- QUESTION: {question}\n
225
- <|im_end|>"""
226
 
 
 
 
227
 
 
228
 
229
-
230
- INSTRUCTION_PROMPT=PromptTemplate(template=instruction_prompt_template_orca, input_variables=['question', 'summaries'])
231
 
232
- return INSTRUCTION_PROMPT, CONTENT_PROMPT
233
 
234
  def adapt_q_from_chat_history(question, chat_history, extracted_memory, keyword_model=""):#keyword_model): # new_question_keywords,
235
 
@@ -485,12 +615,6 @@ def get_expanded_passages(vectorstore, docs, width):
485
 
486
  # Step 1: Filter vstore_docs
487
  vstore_docs = get_docs_from_vstore(vectorstore)
488
- print("Inside get_expanded_passages")
489
- print("Docs:", docs)
490
- print("Type of Docs:", type(docs))
491
- print("Type of first element in Docs:", type(docs[0]))
492
- print("Length of first tuple in Docs:", len(docs[0]))
493
-
494
  doc_sources = {doc.metadata['source'] for doc, _ in docs}
495
  vstore_docs = [(k, v) for k, v in vstore_docs if v.metadata.get('source') in doc_sources]
496
 
@@ -516,162 +640,6 @@ def get_expanded_passages(vectorstore, docs, width):
516
 
517
  return expanded_docs, doc_df
518
 
519
- def create_final_prompt(inputs: Dict[str, str], instruction_prompt, content_prompt, extracted_memory, vectorstore, embeddings): # ,
520
-
521
- question = inputs["question"]
522
- chat_history = inputs["chat_history"]
523
-
524
-
525
- new_question_kworded = adapt_q_from_chat_history(question, chat_history, extracted_memory) # new_question_keywords,
526
-
527
-
528
- #print("The question passed to the vector search is:")
529
- #print(new_question_kworded)
530
-
531
- docs_keep_as_doc, doc_df, docs_keep_out = hybrid_retrieval(new_question_kworded, vectorstore, embeddings, k_val = 5, out_passages = 2,
532
- vec_score_cut_off = 1, vec_weight = 1, bm25_weight = 1, svm_weight = 1)#,
533
- #vectorstore=globals()["vectorstore"], embeddings=globals()["embeddings"])
534
-
535
- # Expand the found passages to the neighbouring context
536
- docs_keep_as_doc, doc_df = get_expanded_passages(vectorstore, docs_keep_out, width=1)
537
-
538
- if docs_keep_as_doc == []:
539
- {"answer": "I'm sorry, I couldn't find a relevant answer to this question.", "sources":"I'm sorry, I couldn't find a relevant source for this question."}
540
-
541
- #new_inputs = inputs.copy()
542
- #new_inputs["question"] = new_question
543
- #new_inputs["chat_history"] = chat_history_str
544
-
545
- #print(docs_url)
546
- #print(doc_df['metadata'])
547
-
548
- # Build up sources content to add to user display
549
-
550
- doc_df['meta_clean'] = [f"<b>{' '.join(f'{k}: {v}' for k, v in d.items() if k != 'page_section')}</b>" for d in doc_df['metadata']]
551
- doc_df['content_meta'] = doc_df['meta_clean'].astype(str) + ".<br><br>" + doc_df['page_content'].astype(str)
552
-
553
- modified_page_content = [f" SOURCE {i+1} - {word}" for i, word in enumerate(doc_df['page_content'])]
554
- docs_content_string = ''.join(modified_page_content)
555
-
556
- #docs_content_string = '<br><br>\n\n SOURCE '.join(doc_df['page_content'])#.replace(" "," ")#.strip()
557
- sources_docs_content_string = '<br><br>'.join(doc_df['content_meta'])#.replace(" "," ")#.strip()
558
- #sources_docs_content_tup = [(sources_docs_content,None)]
559
- #print("The draft instruction prompt is:")
560
- #print(instruction_prompt)
561
-
562
- instruction_prompt_out = instruction_prompt.format(question=new_question_kworded, summaries=docs_content_string)
563
- #print("The final instruction prompt:")
564
- #print(instruction_prompt_out)
565
-
566
- print('Final prompt is: ')
567
- print(instruction_prompt_out)
568
-
569
- return instruction_prompt_out, sources_docs_content_string, new_question_kworded
570
-
571
- def get_history_sources_final_input_prompt(user_input, history, extracted_memory, vectorstore, embeddings):#):
572
-
573
- #if chain_agent is None:
574
- # history.append((user_input, "Please click the button to submit the Huggingface API key before using the chatbot (top right)"))
575
- # return history, history, "", ""
576
- print("\n==== date/time: " + str(datetime.datetime.now()) + " ====")
577
- print("User input: " + user_input)
578
-
579
- history = history or []
580
-
581
-
582
-
583
- # Create instruction prompt
584
- instruction_prompt, content_prompt = create_prompt_templates()
585
- instruction_prompt_out, docs_content_string, new_question_kworded =\
586
- create_final_prompt({"question": user_input, "chat_history": history}, #vectorstore,
587
- instruction_prompt, content_prompt, extracted_memory, vectorstore, embeddings)
588
-
589
-
590
- history.append(user_input)
591
-
592
- print("Output history is:")
593
- print(history)
594
-
595
- #print("The output prompt is:")
596
- #print(instruction_prompt_out)
597
-
598
- return history, docs_content_string, instruction_prompt_out
599
-
600
- def highlight_found_text_single(search_text:str, full_text:str, hlt_chunk_size:int=hlt_chunk_size, hlt_strat:List=hlt_strat, hlt_overlap:int=hlt_overlap) -> str:
601
- """
602
- Highlights occurrences of search_text within full_text.
603
-
604
- Parameters:
605
- - search_text (str): The text to be searched for within full_text.
606
- - full_text (str): The text within which search_text occurrences will be highlighted.
607
-
608
- Returns:
609
- - str: A string with occurrences of search_text highlighted.
610
-
611
- Example:
612
- >>> highlight_found_text("world", "Hello, world! This is a test. Another world awaits.")
613
- 'Hello, <mark style="color:black;">world</mark>! This is a test. Another world awaits.'
614
- """
615
-
616
- def extract_text_from_input(text,i=0):
617
- if isinstance(text, str):
618
- return text.replace(" ", " ").strip()#.replace("\r", " ").replace("\n", " ")
619
- elif isinstance(text, list):
620
- return text[i][0].replace(" ", " ").strip()#.replace("\r", " ").replace("\n", " ")
621
- else:
622
- return ""
623
-
624
- def extract_search_text_from_input(text):
625
- if isinstance(text, str):
626
- return text.replace(" ", " ").strip()#.replace("\r", " ").replace("\n", " ").replace(" ", " ").strip()
627
- elif isinstance(text, list):
628
- return text[-1][1].replace(" ", " ").strip()#.replace("\r", " ").replace("\n", " ").replace(" ", " ").strip()
629
- else:
630
- return ""
631
-
632
- full_text = extract_text_from_input(full_text)
633
- search_text = extract_search_text_from_input(search_text)
634
-
635
- text_splitter = RecursiveCharacterTextSplitter(
636
- chunk_size=hlt_chunk_size,
637
- separators=hlt_strat,
638
- chunk_overlap=hlt_overlap,
639
- )
640
- sections = text_splitter.split_text(search_text)
641
-
642
- #print(sections)
643
-
644
- found_positions = {}
645
- for x in sections:
646
- text_start_pos = full_text.find(x)
647
-
648
- if text_start_pos != -1:
649
- found_positions[text_start_pos] = text_start_pos + len(x)
650
-
651
- # Combine overlapping or adjacent positions
652
- sorted_starts = sorted(found_positions.keys())
653
- combined_positions = []
654
- if sorted_starts:
655
- current_start, current_end = sorted_starts[0], found_positions[sorted_starts[0]]
656
- for start in sorted_starts[1:]:
657
- if start <= (current_end + 1):
658
- current_end = max(current_end, found_positions[start])
659
- else:
660
- combined_positions.append((current_start, current_end))
661
- current_start, current_end = start, found_positions[start]
662
- combined_positions.append((current_start, current_end))
663
-
664
- # Construct pos_tokens
665
- pos_tokens = []
666
- prev_end = 0
667
- for start, end in combined_positions:
668
- pos_tokens.append(full_text[prev_end:start]) # ((full_text[prev_end:start], None))
669
- pos_tokens.append('<mark style="color:black;">' + full_text[start:end] + '</mark>')# ("<mark>" + full_text[start:end] + "</mark>",'found')
670
- prev_end = end
671
- pos_tokens.append(full_text[prev_end:])
672
-
673
- return "".join(pos_tokens)
674
-
675
  def highlight_found_text(search_text: str, full_text: str, hlt_chunk_size:int=hlt_chunk_size, hlt_strat:List=hlt_strat, hlt_overlap:int=hlt_overlap) -> str:
676
  """
677
  Highlights occurrences of search_text within full_text.
@@ -742,110 +710,14 @@ def highlight_found_text(search_text: str, full_text: str, hlt_chunk_size:int=hl
742
  pos_tokens = []
743
  prev_end = 0
744
  for start, end in combined_positions:
745
- pos_tokens.append(full_text[prev_end:start])
746
- pos_tokens.append('<mark style="color:black;">' + full_text[start:end] + '</mark>')
747
- prev_end = end
 
748
  pos_tokens.append(full_text[prev_end:])
749
 
750
  return "".join(pos_tokens)
751
 
752
- # # Chat functions
753
- def produce_streaming_answer_chatbot_hf(history, full_prompt):
754
-
755
- #print("The question is: ")
756
- #print(full_prompt)
757
-
758
- # Get the model and tokenizer, and tokenize the user text.
759
- model_inputs = tokenizer(text=full_prompt, return_tensors="pt", return_attention_mask=False).to(torch_device) # return_attention_mask=False was added
760
-
761
- # Start generation on a separate thread, so that we don't block the UI. The text is pulled from the streamer
762
- # in the main thread. Adds timeout to the streamer to handle exceptions in the generation thread.
763
- streamer = TextIteratorStreamer(tokenizer, timeout=120., skip_prompt=True, skip_special_tokens=True)
764
- generate_kwargs = dict(
765
- model_inputs,
766
- streamer=streamer,
767
- max_new_tokens=max_new_tokens,
768
- do_sample=sample,
769
- repetition_penalty=flan_alpaca_repetition_penalty,
770
- top_p=top_p,
771
- temperature=temperature,
772
- top_k=top_k
773
- )
774
- t = Thread(target=model.generate, kwargs=generate_kwargs)
775
- t.start()
776
-
777
- # Pull the generated text from the streamer, and update the model output.
778
- import time
779
- start = time.time()
780
- NUM_TOKENS=0
781
- print('-'*4+'Start Generation'+'-'*4)
782
-
783
- history[-1][1] = ""
784
- for new_text in streamer:
785
- if new_text == None: new_text = ""
786
- history[-1][1] += new_text
787
- NUM_TOKENS+=1
788
- yield history
789
-
790
- time_generate = time.time() - start
791
- print('\n')
792
- print('-'*4+'End Generation'+'-'*4)
793
- print(f'Num of generated tokens: {NUM_TOKENS}')
794
- print(f'Time for complete generation: {time_generate}s')
795
- print(f'Tokens per secound: {NUM_TOKENS/time_generate}')
796
- print(f'Time per token: {(time_generate/NUM_TOKENS)*1000}ms')
797
-
798
- def produce_streaming_answer_chatbot_ctrans(history, full_prompt):
799
-
800
- print("The question is: ")
801
- print(full_prompt)
802
-
803
- tokens = ctrans_llm.tokenize(full_prompt)
804
-
805
- #config = GenerationConfig(reset=True)
806
-
807
- # Pull the generated text from the streamer, and update the model output.
808
- import time
809
- start = time.time()
810
- NUM_TOKENS=0
811
- print('-'*4+'Start Generation'+'-'*4)
812
-
813
- history[-1][1] = ""
814
- for new_text in ctrans_llm.generate(tokens, top_k=top_k, temperature=temperature, repetition_penalty=repetition_penalty): #ctrans_generate(prompt=tokens, config=config):
815
- if new_text == None: new_text = ""
816
- history[-1][1] += ctrans_llm.detokenize(new_text) #new_text
817
- NUM_TOKENS+=1
818
- yield history
819
-
820
- time_generate = time.time() - start
821
- print('\n')
822
- print('-'*4+'End Generation'+'-'*4)
823
- print(f'Num of generated tokens: {NUM_TOKENS}')
824
- print(f'Time for complete generation: {time_generate}s')
825
- print(f'Tokens per secound: {NUM_TOKENS/time_generate}')
826
- print(f'Time per token: {(time_generate/NUM_TOKENS)*1000}ms')
827
-
828
-
829
- def ctrans_generate(
830
- prompt: str,
831
- llm=ctrans_llm,
832
- config: GenerationConfig = GenerationConfig(),
833
- ):
834
- """Run model inference, will return a Generator if streaming is true."""
835
-
836
- return llm(
837
- prompt,
838
- **asdict(config),
839
- )
840
-
841
- def turn_off_interactivity(user_message, history):
842
- return gr.update(value="", interactive=False), history + [[user_message, None]]
843
-
844
- def update_message(dropdown_value):
845
- return gr.Textbox.update(value=dropdown_value)
846
-
847
- def hide_block():
848
- return gr.Radio.update(visible=False)
849
 
850
  # # Chat history functions
851
 
@@ -923,6 +795,8 @@ def add_inputs_answer_to_history(user_message, history, current_topic):
923
 
924
  return history, extracted_memory
925
 
 
 
926
  def remove_q_stopwords(question): # Remove stopwords from question. Not used at the moment
927
  # Prepare keywords from question by removing stopwords
928
  text = question.lower()
@@ -1003,4 +877,51 @@ def keybert_keywords(text, n, kw_model):
1003
  keywords_list = [item[0] for item in keywords_text]
1004
 
1005
  return keywords_list
1006
1
  import re
2
  import datetime
3
  from typing import TypeVar, Dict, List, Tuple
4
+ import time
5
  from itertools import compress
6
  import pandas as pd
7
  import numpy as np
8
 
9
  # Model packages
10
  import torch
 
11
  from threading import Thread
12
  from transformers import AutoTokenizer, pipeline, TextIteratorStreamer
13
 
 
16
  from dataclasses import asdict, dataclass
17
 
18
  # Langchain functions
 
19
  from langchain.prompts import PromptTemplate
20
  from langchain.vectorstores import FAISS
21
  from langchain.retrievers import SVMRetriever
 
40
 
41
  import gradio as gr
42
 
43
+ torch.cuda.empty_cache()
 
 
 
 
 
 
 
44
 
45
  PandasDataFrame = TypeVar('pd.core.frame.DataFrame')
46
 
47
  embeddings = None # global variable setup
48
  vectorstore = None # global variable setup
49
+ model_type = None # global variable setup
50
 
51
  max_memory_length = 0 # How long should the memory of the conversation last?
52
 
53
  full_text = "" # Define dummy source text (full text) just to enable highlight function to load
54
 
55
+ model = [] # Define empty list for model functions to run
56
+ tokenizer = [] # Define empty list for model functions to run
57
+
58
+ ## Highlight text constants
59
+ hlt_chunk_size = 15
60
+ hlt_strat = [" ", ".", "!", "?", ":", "\n\n", "\n", ","]
61
+ hlt_overlap = 4
62
+
63
+ ## Initialise NER model ##
64
+ ner_model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-mbert-base-multinerd")
65
+
66
+ ## Initialise keyword model ##
67
+ # Used to pull out keywords from chat history to add to user queries behind the scenes
68
+ kw_model = pipeline("feature-extraction", model="sentence-transformers/all-MiniLM-L6-v2")
69
+
70
+
71
+ if torch.cuda.is_available():
72
+ torch_device = "cuda"
73
+ gpu_layers = 5
74
+ else:
75
+ torch_device = "cpu"
76
+ gpu_layers = 0
77
 
78
+ print("Running on device:", torch_device)
79
+ threads = 8 #torch.get_num_threads()
80
+ print("CPU threads:", threads)
81
+
82
+ # Flan Alpaca Model parameters
83
  temperature: float = 0.1
84
  top_k: int = 3
85
  top_p: float = 1
 
87
  flan_alpaca_repetition_penalty: float = 1.3
88
  last_n_tokens: int = 64
89
  max_new_tokens: int = 125
90
+ seed: int = 42
91
  reset: bool = False
92
  stream: bool = True
93
  threads: int = threads
94
+ batch_size:int = 1024
95
  context_length:int = 4096
 
96
  sample = True
97
 
98
+ # CtransGen model parameters
99
+ gpu_layers:int = 6 # Number of layers to offload to GPU. Set to 0 when serving on the Hugging Face free CPU instance
100
+
101
  @dataclass
102
+ class CtransInitConfig_gpu:
103
  temperature: float = temperature
104
  top_k: int = top_k
105
  top_p: float = top_p
106
  repetition_penalty: float = repetition_penalty
107
  last_n_tokens: int = last_n_tokens
108
  max_new_tokens: int = max_new_tokens
109
+ seed: int = seed
110
  reset: bool = reset
111
  stream: bool = stream
112
  threads: int = threads
 
115
  gpu_layers:int = gpu_layers
116
  #stop: list[str] = field(default_factory=lambda: [stop_string])
117
 
118
+ class CtransInitConfig_cpu:
119
+ temperature: float = temperature
120
+ top_k: int = top_k
121
+ top_p: float = top_p
122
+ repetition_penalty: float = repetition_penalty
123
+ last_n_tokens: int = last_n_tokens
124
+ max_new_tokens: int = max_new_tokens
125
+ seed: int = seed
126
+ reset: bool = reset
127
+ stream: bool = stream
128
+ threads: int = threads
129
+ batch_size:int = batch_size
130
+ context_length:int = context_length
131
+ gpu_layers:int = 0
132
+ #stop: list[str] = field(default_factory=lambda: [stop_string])
133
 
134
+ @dataclass
135
+ class CtransGenGenerationConfig:
136
+ top_k: int = top_k
137
+ top_p: float = top_p
138
+ temperature: float = temperature
139
+ repetition_penalty: float = repetition_penalty
140
+ last_n_tokens: int = last_n_tokens
141
+ seed: int = seed
142
+ batch_size:int = batch_size
143
+ threads: int = threads
144
+ reset: bool = True
145
 
146
  # Vectorstore funcs
147
 
 
173
 
174
  return out_message
175
 
176
+ # Prompt functions
177
 
178
+ def base_prompt_templates(model_type = "Flan Alpaca"):
179
 
180
  #EXAMPLE_PROMPT = PromptTemplate(
181
  # template="\nCONTENT:\n\n{page_content}\n\nSOURCE: {source}\n\n",
 
187
  input_variables=["page_content"]
188
  )
189
 
 
190
  # The main prompt:
191
 
192
  instruction_prompt_template_alpaca_quote = """### Instruction:
 
198
 
199
  Response:"""
200
 
201
+ instruction_prompt_template_alpaca = """### Instruction:
202
+ ### User:
203
+ Answer the QUESTION using information from the following CONTENT.
204
+ CONTENT: {summaries}
205
+ QUESTION: {question}
206
+
207
+ Response:"""
208
+
209
  instruction_prompt_template_orca = """
210
  ### System:
211
  You are an AI assistant that follows instruction extremely well. Help as much as you can.
212
  ### User:
213
+ Answer the QUESTION with a short response using information from the following CONTENT.
214
  CONTENT: {summaries}
215
  QUESTION: {question}
216
 
217
  ### Response:"""
218
 
 
219
  instruction_prompt_mistral_orca = """<|im_start|>system\n
220
+ You are an AI assistant that follows instruction extremely well. Help as much as you can.
221
+ <|im_start|>user\n
222
+ Answer the QUESTION using information from the following CONTENT. Respond with short answers that directly answer the question.
223
+ CONTENT: {summaries}
224
+ QUESTION: {question}\n
225
+ <|im_end|>"""
226
+
227
+ if model_type == "Flan Alpaca":
228
+ INSTRUCTION_PROMPT=PromptTemplate(template=instruction_prompt_template_alpaca, input_variables=['question', 'summaries'])
229
+ elif model_type == "Orca Mini":
230
+ INSTRUCTION_PROMPT=PromptTemplate(template=instruction_prompt_template_orca, input_variables=['question', 'summaries'])
231
+
232
+ return INSTRUCTION_PROMPT, CONTENT_PROMPT
233
+
234
+ def generate_expanded_prompt(inputs: Dict[str, str], instruction_prompt, content_prompt, extracted_memory, vectorstore, embeddings): # ,
235
+
236
+ question = inputs["question"]
237
+ chat_history = inputs["chat_history"]
238
+
239
+
240
+ new_question_kworded = adapt_q_from_chat_history(question, chat_history, extracted_memory) # new_question_keywords,
241
+
242
+
243
+ docs_keep_as_doc, doc_df, docs_keep_out = hybrid_retrieval(new_question_kworded, vectorstore, embeddings, k_val = 5, out_passages = 2,
244
+ vec_score_cut_off = 1, vec_weight = 1, bm25_weight = 1, svm_weight = 1)#,
245
+ #vectorstore=globals()["vectorstore"], embeddings=globals()["embeddings"])
246
+
247
+ # Expand the found passages to the neighbouring context
248
+ docs_keep_as_doc, doc_df = get_expanded_passages(vectorstore, docs_keep_out, width=1)
249
 
250
+ if docs_keep_as_doc == []:
251
+ {"answer": "I'm sorry, I couldn't find a relevant answer to this question.", "sources":"I'm sorry, I couldn't find a relevant source for this question."}
252
+
253
 
254
+ # Build up sources content to add to user display
255
 
256
+ doc_df['meta_clean'] = [f"<b>{' '.join(f'{k}: {v}' for k, v in d.items() if k != 'page_section')}</b>" for d in doc_df['metadata']]
257
+ doc_df['content_meta'] = doc_df['meta_clean'].astype(str) + ".<br><br>" + doc_df['page_content'].astype(str)
258
+
259
+ modified_page_content = [f" SOURCE {i+1} - {word}" for i, word in enumerate(doc_df['page_content'])]
260
+ docs_content_string = ''.join(modified_page_content)
261
+
262
+ sources_docs_content_string = '<br><br>'.join(doc_df['content_meta'])#.replace(" "," ")#.strip()
263
+
264
+ instruction_prompt_out = instruction_prompt.format(question=new_question_kworded, summaries=docs_content_string)
265
+
266
+ print('Final prompt is: ')
267
+ print(instruction_prompt_out)
268
+
269
+ return instruction_prompt_out, sources_docs_content_string, new_question_kworded
270
+
271
+ def create_full_prompt(user_input, history, extracted_memory, vectorstore, embeddings, model_type):
272
 
273
+ #if chain_agent is None:
274
+ # history.append((user_input, "Please click the button to submit the Huggingface API key before using the chatbot (top right)"))
275
+ # return history, history, "", ""
276
+ print("\n==== date/time: " + str(datetime.datetime.now()) + " ====")
277
+ print("User input: " + user_input)
278
+
279
+ history = history or []
280
+
281
+ # Create instruction prompt
282
+ instruction_prompt, content_prompt = base_prompt_templates(model_type=model_type)
283
+ instruction_prompt_out, docs_content_string, new_question_kworded =\
284
+ generate_expanded_prompt({"question": user_input, "chat_history": history}, #vectorstore,
285
+ instruction_prompt, content_prompt, extracted_memory, vectorstore, embeddings)
286
+
287
+
288
+ history.append(user_input)
289
+
290
+ print("Output history is:")
291
+ print(history)
292
+
293
+ return history, docs_content_string, instruction_prompt_out
294
+
295
+ # Chat functions
296
+ def produce_streaming_answer_chatbot(history, full_prompt, model_type):
297
+ #print("Model type is: ", model_type)
298
+
299
+ if model_type == "Flan Alpaca":
300
+ # Get the model and tokenizer, and tokenize the user text.
301
+ model_inputs = tokenizer(text=full_prompt, return_tensors="pt", return_attention_mask=False).to(torch_device) # return_attention_mask=False was added
302
+
303
+ # Start generation on a separate thread, so that we don't block the UI. The text is pulled from the streamer
304
+ # in the main thread. Adds timeout to the streamer to handle exceptions in the generation thread.
305
+ streamer = TextIteratorStreamer(tokenizer, timeout=120., skip_prompt=True, skip_special_tokens=True)
306
+ generate_kwargs = dict(
307
+ model_inputs,
308
+ streamer=streamer,
309
+ max_new_tokens=max_new_tokens,
310
+ do_sample=sample,
311
+ repetition_penalty=flan_alpaca_repetition_penalty,
312
+ top_p=top_p,
313
+ temperature=temperature,
314
+ top_k=top_k
315
+ )
316
+ t = Thread(target=model.generate, kwargs=generate_kwargs)
317
+ t.start()
318
+
319
+ # Pull the generated text from the streamer, and update the model output.
320
+ start = time.time()
321
+ NUM_TOKENS=0
322
+ print('-'*4+'Start Generation'+'-'*4)
323
+
324
+ history[-1][1] = ""
325
+ for new_text in streamer:
326
+ if new_text is None: new_text = ""
327
+ history[-1][1] += new_text
328
+ NUM_TOKENS+=1
329
+ yield history
330
+
331
+ time_generate = time.time() - start
332
+ print('\n')
333
+ print('-'*4+'End Generation'+'-'*4)
334
+ print(f'Num of generated tokens: {NUM_TOKENS}')
335
+ print(f'Time for complete generation: {time_generate}s')
336
+ print(f'Tokens per second: {NUM_TOKENS/time_generate}')
337
+ print(f'Time per token: {(time_generate/NUM_TOKENS)*1000}ms')
338
+
339
+ elif model_type == "Orca Mini":
340
+ tokens = model.tokenize(full_prompt)
341
+
342
+ # Pull the generated text from the streamer, and update the model output.
343
+ start = time.time()
344
+ NUM_TOKENS=0
345
+ print('-'*4+'Start Generation'+'-'*4)
346
+
347
+ history[-1][1] = ""
348
+ for new_text in model.generate(tokens, **asdict(CtransGenGenerationConfig())): #CtransGen_generate(prompt=full_prompt)#, config=CtransGenGenerationConfig()): # #top_k=top_k, temperature=temperature, repetition_penalty=repetition_penalty,
349
+ if new_text is None: new_text = ""
350
+ history[-1][1] += model.detokenize(new_text) #new_text
351
+ NUM_TOKENS+=1
352
+ yield history
353
+
354
+ time_generate = time.time() - start
355
+ print('\n')
356
+ print('-'*4+'End Generation'+'-'*4)
357
+ print(f'Num of generated tokens: {NUM_TOKENS}')
358
+ print(f'Time for complete generation: {time_generate}s')
359
+ print(f'Tokens per second: {NUM_TOKENS/time_generate}')
360
+ print(f'Time per token: {(time_generate/NUM_TOKENS)*1000}ms')
361
+
362
+ # Chat helper functions
363
 
364
  def adapt_q_from_chat_history(question, chat_history, extracted_memory, keyword_model=""):#keyword_model): # new_question_keywords,
365
 
 
615
 
616
  # Step 1: Filter vstore_docs
617
  vstore_docs = get_docs_from_vstore(vectorstore)
 
 
 
 
 
 
618
  doc_sources = {doc.metadata['source'] for doc, _ in docs}
619
  vstore_docs = [(k, v) for k, v in vstore_docs if v.metadata.get('source') in doc_sources]
620
 
 
640
 
641
  return expanded_docs, doc_df
642
 
643
  def highlight_found_text(search_text: str, full_text: str, hlt_chunk_size:int=hlt_chunk_size, hlt_strat:List=hlt_strat, hlt_overlap:int=hlt_overlap) -> str:
644
  """
645
  Highlights occurrences of search_text within full_text.
 
710
  pos_tokens = []
711
  prev_end = 0
712
  for start, end in combined_positions:
713
+ if end-start > 15: # Only highlight if there is a significant amount of matched text; avoids picking up single words like 'and' etc.
714
+ pos_tokens.append(full_text[prev_end:start])
715
+ pos_tokens.append('<mark style="color:black;">' + full_text[start:end] + '</mark>')
716
+ prev_end = end
717
  pos_tokens.append(full_text[prev_end:])
718
 
719
  return "".join(pos_tokens)
720
 
721
 
722
  # # Chat history functions
723
 
 
795
 
796
  return history, extracted_memory
797
 
798
+ # Keyword functions
799
+
800
  def remove_q_stopwords(question): # Remove stopwords from question. Not used at the moment
801
  # Prepare keywords from question by removing stopwords
802
  text = question.lower()
 
877
  keywords_list = [item[0] for item in keywords_text]
878
 
879
  return keywords_list
880
+
881
+ # Gradio functions
882
+ def turn_off_interactivity(user_message, history):
883
+ return gr.update(value="", interactive=False), history + [[user_message, None]]
884
+
885
+ def restore_interactivity():
886
+ return gr.update(interactive=True)
887
+
888
+ def update_message(dropdown_value):
889
+ return gr.Textbox.update(value=dropdown_value)
890
+
891
+ def hide_block():
892
+ return gr.Radio.update(visible=False)
893
+
894
+ # Vote function
895
+
896
+ def vote(data: gr.LikeData, chat_history, instruction_prompt_out, model_type):
897
+ import os
898
+ import pandas as pd
899
+
900
+ chat_history_last = str(str(chat_history[-1][0]) + " - " + str(chat_history[-1][1]))
901
 
902
+ response_df = pd.DataFrame(data={"thumbs_up":data.liked,
903
+ "chosen_response":data.value,
904
+ "input_prompt":instruction_prompt_out,
905
+ "chat_history":chat_history_last,
906
+ "model_type": model_type,
907
+ "date_time": pd.Timestamp.now()}, index=[0])
908
+
909
+ if data.liked:
910
+ print("You upvoted this response: " + data.value)
911
+
912
+ if os.path.isfile("thumbs_up_data.csv"):
913
+ existing_thumbs_up_df = pd.read_csv("thumbs_up_data.csv")
914
+ thumbs_up_df_concat = pd.concat([existing_thumbs_up_df, response_df], ignore_index=True).drop("Unnamed: 0",axis=1, errors="ignore")
915
+ thumbs_up_df_concat.to_csv("thumbs_up_data.csv")
916
+ else:
917
+ response_df.to_csv("thumbs_up_data.csv")
918
+
919
+ else:
920
+ print("You downvoted this response: " + data.value)
921
+
922
+ if os.path.isfile("thumbs_down_data.csv"):
923
+ existing_thumbs_down_df = pd.read_csv("thumbs_down_data.csv")
924
+ thumbs_down_df_concat = pd.concat([existing_thumbs_down_df, response_df], ignore_index=True).drop("Unnamed: 0",axis=1, errors="ignore")
925
+ thumbs_down_df_concat.to_csv("thumbs_down_data.csv")
926
+ else:
927
+ response_df.to_csv("thumbs_down_data.csv")
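The vote() function above records each like or dislike by re-reading the whole CSV, concatenating the new row, and rewriting the file, which is why it has to drop the stray 'Unnamed: 0' column on the next read. A minimal alternative sketch of the same logging (illustrative, not the app's code): appending with mode='a' and writing with index=False keeps the file append-only and avoids the extra index column entirely.

    import os
    import pandas as pd

    def log_feedback(row: dict, path: str = "thumbs_up_data.csv"):
        # Append one feedback record, writing the header only when the file is new.
        record = pd.DataFrame(data=row, index=[0])
        if os.path.isfile(path):
            record.to_csv(path, mode="a", header=False, index=False)
        else:
            record.to_csv(path, index=False)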
requirements.txt CHANGED
@@ -1,23 +1,17 @@
1
  langchain
2
  beautifulsoup4
3
  pandas
4
- black
5
- isort
6
- Flask
7
  transformers
8
  --extra-index-url https://download.pytorch.org/whl/cu113
9
  torch
10
  sentence_transformers
11
  faiss-cpu
12
  bitsandbytes
13
- accelerate
14
- optimum
15
  pypdf
16
- gradio==3.47.1
17
- gradio_client==0.6.0
18
  python-docx
19
- gpt4all
20
  ctransformers[cuda]
21
  keybert
22
  span_marker
23
- gensim
 
 
 
1
  langchain
2
  beautifulsoup4
3
  pandas
 
 
 
4
  transformers
5
  --extra-index-url https://download.pytorch.org/whl/cu113
6
  torch
7
  sentence_transformers
8
  faiss-cpu
9
  bitsandbytes
 
 
10
  pypdf
 
 
11
  python-docx
 
12
  ctransformers[cuda]
13
  keybert
14
  span_marker
15
+ gensim
16
+ gradio==3.42.0
17
+ gradio_client
thumbs_up_data.csv ADDED
@@ -0,0 +1,36 @@
1
+ ,thumbs_up,chosen_response,input_prompt,chat_history,model_type,date_time
2
+ 0,True,"The vision statement for Lambeth is ""a borough with social and climate justice at its heart"".","### Instruction:
3
+ ### User:
4
+ Answer the QUESTION using information from the following CONTENT.
5
+ CONTENT: SOURCE 1 - - attended by over 150 people7 focussed workshops with local Lambeth organisations and their services-users - attended by over 80 people 2 weeks of market research across public spaces in Lambeth, asking people their vision for Lambeth in 20301 Lambeth 2030 consultation survey open for 6 weeks to the public with over 600 responses 14 | Lambeth 2030 Our Future, Our Lambeth Lambeth 2030 Our Future, Our Lambeth | 15The vision Through listening and building on what we already know, we’ve created a vision for the future of Lambeth that’s rooted in what people want. This is a vision that belongs to everyone.Achieving this future vision of Lambeth comes down to all of us. We are all connected, and we all have a stake in Lambeth to make it the best place to live, work and visit in the UK. From our conversations we know people agree with a group of core priorities and ambitions for the future of Lambeth. They are ready to come together and bring this vision to life, and there is also strong support in the shift towards taking a longer-term view, so that we are ready for the unforeseen challenges of t SOURCE 2 - e of Lambeth. This Borough Plan will not have all the answers to the challenges we face but it is our commitment to everyone in Lambeth that we will strive to get the basics right, and that we will harness the abundance of local expertise, energy and passion in our design and decision-making so that everybody in the borough is empowered to create Lambeth 2030. This is Our Future; This is Our Lambeth. 06 | Lambeth 2030 Our Future, Our Lambeth 07 Lambeth 2030 Our Future, Our Lambeth |Lambeth 2030 Vision Statement Lambeth – a borough with social and climate justice at its heart. By harnessing the power and pride of our people and partnerships, we will proactively tackle inequalities so that children and young people can have the best start in life and so everyone can feel safe and thrive in a place of opportunity. SUSTAINABLE OPPORTUNITY HEALTHY COMMUNITY SAFER 08 | Lambeth 2030 Our Future, Our Lambeth 09 Lambeth 2030 Our Future, Our Lambeth |State of the Borough At 22,200 Lambeth has the largest LGBTQ+ population in London The (mean) average house price in  Lambeth is £689,009 12th highest in London 17.3% of Lambeth is green space 5th lowest in London 317,600 Lambeth is an inner south London borough with 317,600 residents 9th largest population in London Lambeth’s population is diverse and multicultural Asian, Asian British – 7.3% Black British, African or Caribbean – 24.0% Mixed or Multiple Ethnic groups – 8.1% White British – 4
6
+ QUESTION: What is the vision statement for Lambeth?
7
+
8
+ Response:","What is the vision statement for Lambeth? - The vision statement for Lambeth is ""a borough with social and climate justice at its heart"".",Flan Alpaca,2023-10-09 23:10:04.119596
9
+ 1,True,"The commitments for Lambeth are: 1. We will take a one borough approach to deliver our services consistently and well 2. People have a say and stake in the decisions that matter 3. We will collaborate with our people and partners to innovate and implement together 4. We will focus on what our residents want and be honest about what we can and can’t do, whilst being courageous to take bold action.","### Instruction:
10
+ ### User:
11
+ Answer the QUESTION using information from the following CONTENT.
12
+ CONTENT: SOURCE 1 - at the changes we make, and the partnerships we form, are all contributing to improving the lives of those who live, work, learn and visit our borough. -Our Future, Our Lambeth Our Vision Statement Lambeth – a borough with social and climate justice at its heart. By harnessing the power and pride of our people and partnerships, we will proactively tackle inequalities so that children and young people can have the best start in life and so everyone can feel safe and thrive in a place of opportunity. Our Commitments for Lambeth 1. We get the basics right and deliver great public services that fit what people need -We will take a one borough approach to deliver our services consistently and well 2. People have a say and stake in the decisions that matter -We will be a listening and open borough that recognises and values our community voices 3. We work together in partnership, to harness what makes Lambeth special - We will collaborate with our people and partners to innovate and implement together 4. We are accessible, transparent and we stand up to challenges -We will focus on what our residents want and be honest about what we can and can’t do, whilst being courageous to take bold action. SOURCE 2 - Achieving this future vision of Lambeth comes down to all of us. We are all connected, and we all have a stake in Lambeth to make it the best place to live, work and visit in the UK. From our conversations we know people agree with a group of core priorities and ambitions for the future of Lambeth. They are ready to come together and bring this vision to life, and there is also strong support in the shift towards taking a longer-term view, so that we are ready for the unforeseen challenges of the future. Our Commitments for Lambeth To deliver this vision requires individual and collective commitment and action. This means the Council and Lambeth’s communities and organisations coming together and standing as one to transform the ways we work. By listening to our communities, understanding their experiences, and aligning our priorities with theirs, we can build confidence between Lambeth’s institutions, businesses, community groups and organisations, and residents, and we can ensure that the changes we make, and the partnerships we form, are all contributing to improving the lives of those who live, work, learn and visit our borough. -Our Future, Our Lambeth Our Vision Statement Lambeth – a borough with social and climate justice at its heart. By harnessing the power and pride of our people and partnerships, we will proactively tackle inequalities so that children and young people can have the best start in life and so everyone can feel safe and thrive in a place of opportuni
13
+ QUESTION: What are the commitments for Lambeth?
14
+
15
+ Response:","What are the commitments for Lambeth? - The commitments for Lambeth are: 1. We will take a one borough approach to deliver our services consistently and well 2. People have a say and stake in the decisions that matter 3. We will collaborate with our people and partners to innovate and implement together 4. We will focus on what our residents want and be honest about what we can and can’t do, whilst being courageous to take bold action.",Flan Alpaca,2023-10-09 23:10:43.075937
16
+ 2,True,"The 2030 outcomes for Lambeth are:
17
+ 1. Lambeth will have lower levels of deprivation, with fewer children growing up in poverty.
18
+ 2. Lambeth will tackle the structural inequalities adversely impacting Black, Asian and Multi-Ethnic residents by being a borough of antiracism.
19
+ 3. Lambeth will be a borough of progress, working with LGBTQ+ communities and disabled residents to tackle the biggest challenges they face.
20
+ 4. By 2030, Lambeth will be a Net Zero Borough.
21
+ 5. Lambeth residents will experience good health and wellbeing, with an improved healthy life expectancy for those with the poorest outcomes.
22
+ 6. Lambeth will be a borough of equity and justice, with a focus on making neighbourhoods fit for the future.","
23
+ ### System:
24
+ You are an AI assistant that follows instruction extremely well. Help as much as you can.
25
+ ### User:
26
+ Answer the QUESTION with a short response using information from the following CONTENT.
27
+ CONTENT: SOURCE 1 - eekers and raise the voices of people with lived-experience. We will be led by five core values: Inclusivity, Openness, Participation, Inspiration and Integrity. To drive this forward, we have created the Lambeth Sanctuary Forum, a multi-agency group working with the voluntary and community sector, structured to deliver the priorities of our sanctuary-seekers, with humanity and compassion. 36 | Lambeth 2030 Our Future, Our Lambeth Lambeth 2030 Our Future, Our Lambeth | 37Our Lambeth 2030 Outcomes Our ambitions are bold – it is going to take everyone in the borough to play their part in delivering for Lambeth, ensuring that we are all accountable and committed to a better future for everyone. Our Lambeth Outcomes have been shaped to unite us in that effort. A Borough of Equity and Justice • By 2030, Lambeth will have lower levels of deprivation, with fewer children growing up in poverty. • By 2030, Lambeth will tackle the structural inequalities adversely impacting Black, Asian and Multi-Ethnic residents by being a borough of antiracism. • By 2030, Lambeth will be a borough of progress, working with LGBTQ+ communities and disabled residents to tackle the biggest challenges they face. Our 2030 Ambition: Making Lambeth Neighbourhoods Fit for the Future • By 2030, Lambeth will be a Net Zero Borough. • By 2030, Lambeth residents will experience good health and wellbeing, with an improved healthy life expectancy for those with the poorest outcomes. • By 203 SOURCE 2 - Lambeth 2030 Contents Forewords 04 Introduction 06 State of the Borough 10 Our Previous Borough Plan 12 Our Shared Vision for Lambeth 2030 14 Our Ambitions for Lambeth 2030 18 The Lambeth Golden Thread – A Borough of Equity and Justice 20 Ambition 1 – Making Lambeth Neighbourhoods Fit for the Future 22 Ambition 2 – Making Lambeth One of the Safest Boroughs in London 28 Ambition 3 – Making Lambeth A Place We Can All Call Home 32 Our Lambeth 2030 Outcomes 38 03 Lambeth 2030 Our Future, Our Lambeth| 02 | Lambeth 2030 Our Future, Our Lambeth
28
+ QUESTION: What are the 2030 outcomes for Lambeth?
29
+
30
+ ### Response:","What are the 2030 outcomes for Lambeth? - The 2030 outcomes for Lambeth are:
31
+ 1. Lambeth will have lower levels of deprivation, with fewer children growing up in poverty.
32
+ 2. Lambeth will tackle the structural inequalities adversely impacting Black, Asian and Multi-Ethnic residents by being a borough of antiracism.
33
+ 3. Lambeth will be a borough of progress, working with LGBTQ+ communities and disabled residents to tackle the biggest challenges they face.
34
+ 4. By 2030, Lambeth will be a Net Zero Borough.
35
+ 5. Lambeth residents will experience good health and wellbeing, with an improved healthy life expectancy for those with the poorest outcomes.
36
+ 6. Lambeth will be a borough of equity and justice, with a focus on making neighbourhoods fit for the future.",Orca Mini,2023-10-09 23:13:38.912570