sentence-transformers
datasets
torch
streamlit-chat-media
streamlit-chat
transformers
PyPDF2
ratelimit
backoff
tqdm
openai
PyMuPDF  # instead of fitz
nltk
langchain_community
langchain