bs4
fuzzywuzzy
nltk
numpy
scikit-learn==1.0.2
streamlit
pandas
distance
regex