PyPDF2
pdf2image
pytesseract
streamlit
langchain
deeplake
assemblyai
sentence-transformers
youtube-transcript-api
langchain-google-genai