Transformers
SetFit
pypdf2
openpyxl
pdf2image
pytesseract
Pillow