Stefano Fiorucci
metadata
title: Who killed Laura Palmer?
emoji: 🗻🗻
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.2.0
app_file: app.py
pinned: false
license: apache-2.0

Who killed Laura Palmer?

🗻🗻 Twin Peaks Question Answering system

WKLP is a simple Question Answering system based on data crawled from the Twin Peaks Wiki. It is built using 🔍 Haystack, an awesome open-source framework for building search systems that work intelligently over large document collections.


Project architecture 🧱

Project architecture


What can I learn from this project? 📚

  • How to quickly ⌚ build a modern Question Answering system using 🔍 Haystack
  • How to generate questions based on your documents
  • How to build a nice Streamlit web app to show your QA system
  • How to optimize the web app to 🚀 deploy in 🤗 Spaces
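To give a rough idea of what the first point involves, here is a minimal sketch of an extractive QA pipeline. It assumes the Haystack 1.x API (the `farm-haystack` package); the in-memory document store, the TF-IDF retriever, and the helper names `to_haystack_dicts`, `build_pipeline`, and `ask` are illustrative assumptions and not necessarily what this project uses. Only the reader model name comes from this README.

```python
from typing import Dict, List


def to_haystack_dicts(pages: Dict[str, str]) -> List[dict]:
    """Convert {page_name: page_text} into the dict format accepted by
    Haystack 1.x document stores ("content" plus optional "meta")."""
    return [
        {"content": text, "meta": {"name": name}}
        for name, text in pages.items()
        if text.strip()  # skip pages with no text
    ]


def build_pipeline(docs: List[dict]):
    """Build a retriever + reader pipeline (assumes Haystack 1.x is installed)."""
    # Local imports, so the helper above can be used without Haystack.
    from haystack.document_stores import InMemoryDocumentStore
    from haystack.nodes import FARMReader, TfidfRetriever
    from haystack.pipelines import ExtractiveQAPipeline

    # Assumption: a simple in-memory store; documents must be written
    # before the TF-IDF retriever is created, since it fits on init.
    document_store = InMemoryDocumentStore()
    document_store.write_documents(docs)
    retriever = TfidfRetriever(document_store=document_store)

    # Reader model named in this README, running on CPU.
    reader = FARMReader(
        model_name_or_path="deepset/roberta-base-squad2", use_gpu=False
    )
    return ExtractiveQAPipeline(reader=reader, retriever=retriever)


def ask(pipeline, question: str, top_k: int = 3) -> List[str]:
    """Run a query and return the text of the top answers."""
    result = pipeline.run(
        query=question,
        params={"Retriever": {"top_k": 10}, "Reader": {"top_k": top_k}},
    )
    return [answer.answer for answer in result["answers"]]
```

With Haystack installed, something like `ask(build_pipeline(to_haystack_dicts(pages)), "Who killed Laura Palmer?")` would return the reader's best answer spans, where `pages` is a hypothetical dict of crawled wiki pages.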

Web app preview

Repository structure 📁

Within each folder, you can find more in-depth explanations.

Installation 💻

To install this project locally, follow these steps:

  • git clone https://github.com/anakin87/who-killed-laura-palmer
  • cd who-killed-laura-palmer
  • pip install -r requirements.txt

To run the web app, simply type: streamlit run app.py

Possible improvements ✨

Project structure

  • The project is optimized to be deployed in Hugging Face Spaces and consists of an all-in-one Streamlit web app. In more structured production environments, I suggest dividing the software into three parts:

Reader

  • The reader model (deepset/roberta-base-squad2) is a good compromise between speed and accuracy when running on CPU. Better (and more computationally expensive) models exist, as you can read in the Haystack documentation.
  • You could also prepare a Twin Peaks QA dataset and fine-tune the reader model to improve accuracy, as explained in this Haystack tutorial.
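As a starting point for the fine-tuning idea, the sketch below builds a small dataset file in SQuAD format, which is the format Haystack 1.x readers expect for training. The helper names `make_squad_record` and `to_squad_json` and the sample sentence are hypothetical placeholders, not part of this project.

```python
import json
from typing import List


def make_squad_record(
    title: str, context: str, question: str, answer_text: str, qa_id: str
) -> dict:
    """Build one SQuAD-style entry with a single question and answer."""
    # SQuAD stores the character offset of the answer inside the context.
    start = context.find(answer_text)
    if start == -1:
        raise ValueError("answer_text must appear verbatim in context")
    return {
        "title": title,
        "paragraphs": [
            {
                "context": context,
                "qas": [
                    {
                        "id": qa_id,
                        "question": question,
                        "answers": [
                            {"text": answer_text, "answer_start": start}
                        ],
                        "is_impossible": False,
                    }
                ],
            }
        ],
    }


def to_squad_json(records: List[dict], version: str = "v2.0") -> str:
    """Wrap the records in the top-level SQuAD structure and serialize."""
    return json.dumps({"version": version, "data": records}, indent=2)


# Placeholder example; real contexts would come from the crawled wiki pages.
record = make_squad_record(
    title="Dale Cooper",
    context="Dale Cooper is an FBI special agent.",
    question="Who is Dale Cooper?",
    answer_text="an FBI special agent",
    qa_id="q1",
)
squad_json = to_squad_json([record])
```

The resulting JSON could be saved to a file and passed to the reader's `train` method (`data_dir` and `train_filename` arguments in Haystack 1.x); the number of examples needed for a real accuracy gain is not something this sketch addresses.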