How to persist files when using Streamlit on Hugging Face Spaces?

#1
by JBHF - opened


I have built a Streamlit Retrieval Augmented Generation (RAG) system that generates a vector store (a FAISS index) from PDF files that have already been uploaded. However, it regenerates this vector store every time the Streamlit session restarts, and that takes several minutes each time before the user can enter a query!
The problem gets worse as more documents are uploaded to the Streamlit application!

So I want to persist the FAISS vector store in a file somewhere, so that it can be read back from that storage location. Loading it should take almost no time, and the application will be much faster.
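
To make the idea concrete, here is a minimal sketch of the "load if the file exists, otherwise build and save" pattern I have in mind. The function names (`load_or_build`, `pickle_save`, `pickle_load`) are hypothetical, and pickle is only a stand-in so the sketch runs without FAISS installed. With a raw FAISS index, the save and load steps would presumably be `faiss.write_index(index, str(path))` and `faiss.read_index(str(path))` instead.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_or_build(
    path: Path,
    build: Callable[[], Any],
    save: Callable[[Any, Path], None],
    load: Callable[[Path], Any],
) -> Any:
    """Return the object stored at `path`; build and save it if the file is missing."""
    if path.exists():
        # Fast path: reuse the file written by an earlier run.
        return load(path)
    # Slow path: run the expensive build once, then persist the result.
    obj = build()
    path.parent.mkdir(parents=True, exist_ok=True)
    save(obj, path)
    return obj


# Stand-in persistence helpers (assumption: replace with FAISS read/write calls).
def pickle_save(obj: Any, path: Path) -> None:
    with path.open("wb") as f:
        pickle.dump(obj, f)


def pickle_load(path: Path) -> Any:
    with path.open("rb") as f:
        return pickle.load(f)
```

This only helps if the file survives a restart, so the storage location still has to be persistent. It also does not detect newly uploaded documents: the cached file would have to be deleted or rebuilt when the set of PDFs changes.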

I am looking for a solution that is as cheap as possible, preferably free. Can I write the file to a GitHub account, a Google Drive account, or something similar? Or perhaps even somewhere on Hugging Face (Spaces) itself, for free?

Can I use the methods described here for that purpose?
https://huggingface.co/docs/huggingface_hub/package_reference/hf_file_system
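
As far as I understand that page, `HfFileSystem` exposes Hub repositories through an fsspec-style interface, with dataset repos addressed as `datasets/<user>/<repo>/<file>`. A rough sketch of how the index file might be pushed to and pulled from a dataset repo follows. The repo id, the file name, and the `HF_TOKEN` environment variable (for example set as a Space secret) are assumptions for illustration, and I have not verified that this is the recommended approach.

```python
import os
import shutil
from pathlib import Path


def remote_path(repo_id: str, filename: str) -> str:
    """Build the HfFileSystem path of a file in a dataset repo."""
    return f"datasets/{repo_id}/{filename}"


def _filesystem():
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import HfFileSystem

    # Assumption: a write token is available in the HF_TOKEN environment variable.
    return HfFileSystem(token=os.environ.get("HF_TOKEN"))


def push_file(local_file: Path, repo_id: str) -> None:
    """Copy a local file into the dataset repo (needs network access and a write token)."""
    fs = _filesystem()
    with local_file.open("rb") as src, fs.open(remote_path(repo_id, local_file.name), "wb") as dst:
        shutil.copyfileobj(src, dst)


def pull_file(local_file: Path, repo_id: str) -> bool:
    """Download the file if it exists in the repo; return False when it is missing."""
    fs = _filesystem()
    remote = remote_path(repo_id, local_file.name)
    if not fs.exists(remote):
        return False
    local_file.parent.mkdir(parents=True, exist_ok=True)
    with fs.open(remote, "rb") as src, local_file.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    return True
```

The intended usage would be to call `pull_file(Path("index.faiss"), "my-user/my-rag-store")` at startup (hypothetical repo id), rebuild the index only if it returns `False`, and call `push_file` after a rebuild.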
