Applying to PDFs

#4
by calebaryee321 - opened

I want to apply this to PDFs by converting them to images and running those images through the model. If this works, would it in theory be able to search through all the images for an answer to a question?
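A minimal sketch of the page-by-page idea described above, not code from anyone in this thread. It assumes the third-party `pdf2image` package (with poppler) for rendering pages, and it treats the model as a hypothetical `ask` callable that returns an answer and a confidence score for a single page image. The names `pdf_to_images`, `best_answer`, and `ask` are illustrative only.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical adapter: takes (page_image, question), returns (answer, score).
# Wrap whatever document-QA model you use so it matches this shape.
AskFn = Callable[[object, str], Tuple[str, float]]


def pdf_to_images(pdf_path: str, dpi: int = 200) -> list:
    """Render each PDF page to a PIL image (assumes pdf2image and poppler)."""
    from pdf2image import convert_from_path  # third-party, imported lazily
    return convert_from_path(pdf_path, dpi=dpi)


def best_answer(pages: Iterable[object], question: str, ask: AskFn) -> Optional[dict]:
    """Ask the same question on every page and keep the highest-scoring answer."""
    best = None
    for page_number, page in enumerate(pages, start=1):
        answer, score = ask(page, question)
        if best is None or score > best["score"]:
            best = {"page": page_number, "answer": answer, "score": score}
    return best
```

Under these assumptions, `best_answer(pdf_to_images("report.pdf"), "What is the invoice total?", ask)` would return the page number, answer, and score of the most confident hit, or `None` for an empty document. Scores from separate pages are not guaranteed to be comparable, so treat the winner as a candidate to check, not a definitive result.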

I tried this with sample images, and it worked as long as the text was clear and the data was columnar, like a table. The wording of your question also has to match the labels in the document. Within those constraints the answers were near perfect. I did notice a problem with multi-line cells: when a long sentence wraps onto a second line in a cell, the model misses the first line. Even so, it performed better than Layout2LM, which was super impressive.

@awacke1 that's awesome, do you mind sharing your code?

@calebaryee321 did you end up finding a way of applying this to PDFs? Or at least, a way of extracting the text/content from the PDF to feed them to another QA model?
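For the text-extraction route asked about here, one possible sketch follows. It assumes the third-party `pypdf` package and only works for PDFs that carry a text layer; scanned pages would need OCR instead. The helper names `extract_page_texts` and `chunk_words`, and the window sizes, are illustrative assumptions, not anything confirmed in this thread.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the text layer of each page (assumes pypdf is installed)."""
    from pypdf import PdfReader  # third-party, imported lazily
    reader = PdfReader(pdf_path)
    # Pages without a text layer (e.g. scans) yield an empty string here.
    return [page.extract_text() or "" for page in reader.pages]


def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into overlapping word windows small enough for a QA model."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Each chunk can then be passed as the context to a text QA model. The overlap is there so that an answer straddling a window boundary still appears whole in at least one chunk.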
