torch
torchvision
transformers
datasets
pytesseract
opencv-python
pdf2image
pypdf
langdetect
gradio