PyPDF2
scikit-learn
transformers
PyMuPDF
pytesseract
pillow
tensorflow
torch
faiss-cpu
numpy