torch
transformers
langchain
openai
watermark
chromadb
tiktoken
youtube-transcript-api
pytube
sentence_transformers
InstructorEmbedding
xformers
unstructured
llama-cpp-python
pdf2image
pdfminer