openai
llama-index
langchain
chromadb
torch
transformers
gradio
tiktoken
scipy
scikit-learn
mosestokenizer
indic-nlp-library
sentence_transformers
faiss-cpu
googletrans==3.1.0a0
BeautifulSoup4
pypdf
PyPDF2
html2text