import streamlit as st


def app():
    with open('style.css') as f:
        st.markdown(f"", unsafe_allow_html=True)
    footer = """ """
    st.markdown(footer, unsafe_allow_html=True)
    st.subheader("Intro")
    intro = """
Wikipedia Assistant is an example of a task usually referred to as Long-Form Question Answering (LFQA). These systems query large document stores for relevant information and then use the retrieved documents to generate accurate, multi-sentence answers. The documents related to a given query, colloquially called context passages, are not used merely as source tokens for extracted answers; instead, they provide a larger context for the synthesis of original, abstractive long-form answers. LFQA systems usually consist of three components:

- a document store that holds the context passages
- encoders that embed questions and passages so that the document store can be queried
- a seq2seq model that generates the answer from the question and the retrieved passages

""" st.markdown(intro, unsafe_allow_html=True) st.image("lfqa.png", caption="LFQA Architecture") st.subheader("UI/UX") st.write("Each sentence in the generated answer ends with a coloured tooltip; the colour ranges from red to green. " "The tooltip contains a value representing answer sentence similarity to a specific sentence in the " "Wikipedia context passages retrieved. Mouseover on the tooltip will show the sentence from the " "Wikipedia context passage. If a sentence similarity is 1.0, the seq2seq model extracted and " "copied the sentence verbatim from Wikipedia context passages. Lower values of sentence " "similarity indicate the seq2seq model is struggling to generate a relevant sentence for the question " "asked.") st.image("wikipedia_answer.png", caption="Answer with similarity tooltips") st.write("Below the generated answer are question-related Wikipedia context paragraphs (passages). One can view " "these passages in a raw format retrieved using the 'Paragraphs' select menu option. The 'Sentences' menu " "option shows the same paragraphs but on a sentence level. Finally, the 'Answer Similarity' menu option " "shows the most similar three sentences from context paragraphs to each sentence in the generated answer.") st.image("wikipedia_context.png", caption="Context paragraphs (passages)") tts = """
Wikipedia Assistant converts the text-based answer to speech using either the Google text-to-speech engine or an ESPnet model hosted on the Hugging Face Hub.

""" st.markdown(tts, unsafe_allow_html=True) st.subheader("Tips") tips = """
The LFQA task is far from solved. Wikipedia Assistant will sometimes generate an answer that is unrelated to the question asked, or even downright wrong. However, if the question is elaborate and specific, there is a decent chance of getting a sensible answer. LFQA systems target ELI5-style, non-factoid questions. As a general guideline, questions starting with why, what, and how are better suited than where and who questions. Be elaborate.

For example, when asked a science-based question, Wikipedia Assistant is better suited to answer "Why do airplane jet engines leave contrails in the sky?" than "Why do contrails exist?". Detailed and precise questions are more likely to match the half a dozen relevant passages in a 20+ GB Wikipedia dump that are needed to construct a good answer.

""" st.markdown(tips, unsafe_allow_html=True) st.subheader("Technical details") techinical_intro = """
The question is encoded with a question encoder and sent to a server to find the most relevant Wikipedia passages. The Wikipedia passages were encoded in advance with a passage encoder and stored in a Faiss index. The passages that match the question (a.k.a. context passages) are retrieved from the Faiss index and passed to a BART-based seq2seq model, which synthesizes an original answer to the question.
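
As a rough illustration of this retrieve-then-generate flow, the sketch below uses a toy bag-of-words encoder and a brute-force inner-product search in place of the real encoders and the Faiss index. The names `encode`, `PassageIndex`, and `build_generator_input`, the sample passages, and the `question: ... context: ...` input layout are assumptions made for this example; they are not the application's actual code.

```python
import math
import re
from collections import Counter


def encode(text):
    # Toy stand-in for a learned encoder (assumption): an L2-normalised
    # bag-of-words vector stored as a sparse dict of token -> weight.
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def inner_product(a, b):
    # Inner product of two sparse vectors.
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


class PassageIndex:
    # Hypothetical brute-force index; a real deployment would use Faiss.
    def __init__(self, passages):
        self.passages = list(passages)
        # Passages are encoded once, ahead of query time.
        self.vectors = [encode(p) for p in self.passages]

    def search(self, question, k=2):
        # Score every passage against the question and keep the k best.
        q = encode(question)
        scored = sorted(
            ((inner_product(q, v), i) for i, v in enumerate(self.vectors)),
            reverse=True,
        )
        return [(self.passages[i], score) for score, i in scored[:k]]


def build_generator_input(question, context_passages):
    # Assumed input layout for the seq2seq model: question plus contexts.
    return "question: " + question + " context: " + " ".join(context_passages)


passages = [
    "Contrails form when water vapour in jet engine exhaust condenses and freezes in cold air.",
    "The Faiss library performs similarity search over dense vectors.",
    "Jet engines burn fuel and produce hot exhaust that contains water vapour.",
]
question = "Why do airplane jet engines leave contrails in the sky?"

index = PassageIndex(passages)
top = index.search(question, k=2)
prompt = build_generator_input(question, [p for p, _ in top])
print(prompt)
```

The string built in the last step is what a seq2seq model would receive. Running the generation itself is left out, since it depends on the specific model and tokenizer.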
""" st.markdown(techinical_intro, unsafe_allow_html=True)