huggingface
langchain
sentence_transformers
transformers
torch
tensorflow
gradio
pdfminer.six
cache
docx2txt