Spaces: nielsr/donut-docvqa

Applying to pdfs

by calebaryee321 - opened Sep 27, 2022

Sep 27, 2022

I want to apply this to PDFs by converting them to images and running those through the model. If that works, would it in theory be able to search through all the page images for an answer to a question?
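For reference, here is a minimal sketch of the idea in the question: render each page to an image, ask the same question on every page, and collect the answers. It is not code from this Space. It assumes the `pdf2image` package (which needs poppler installed) and the `naver-clova-ix/donut-base-finetuned-docvqa` checkpoint; the helper names `pdf_to_images`, `make_donut_asker`, and `ask_every_page` are made up for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def pdf_to_images(pdf_path: str, dpi: int = 200) -> list:
    """Render every PDF page to a PIL image (assumes pdf2image + poppler)."""
    from pdf2image import convert_from_path  # lazy import: optional dependency
    return convert_from_path(pdf_path, dpi=dpi)


def make_donut_asker(
    checkpoint: str = "naver-clova-ix/donut-base-finetuned-docvqa",  # assumed checkpoint
) -> Callable:
    """Build ask(image, question) -> answer string on top of a Donut DocVQA model."""
    import re
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    model.eval()

    def ask(image, question: str) -> str:
        # Donut's DocVQA prompt wraps the question in task-specific tokens.
        prompt = f"<s_docvqa><s_question>{question}</s_question><s_answer>"
        decoder_input_ids = processor.tokenizer(
            prompt, add_special_tokens=False, return_tensors="pt"
        ).input_ids
        pixel_values = processor(image.convert("RGB"), return_tensors="pt").pixel_values
        sequences = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
        )
        text = processor.batch_decode(sequences)[0]
        text = text.replace(processor.tokenizer.eos_token, "")
        text = text.replace(processor.tokenizer.pad_token, "")
        text = re.sub(r"<.*?>", "", text, count=1).strip()  # drop the task start token
        parsed = processor.token2json(text)
        return parsed.get("answer", "") if isinstance(parsed, dict) else ""

    return ask


def ask_every_page(
    pages: Iterable, question: str, ask: Callable
) -> List[Tuple[int, str]]:
    """Ask the same question on each page; keep (page_number, answer) for non-empty answers."""
    results = []
    for page_number, image in enumerate(pages, start=1):
        answer = ask(image, question).strip()
        if answer:
            results.append((page_number, answer))
    return results


# Hypothetical usage (file name is a placeholder):
# pages = pdf_to_images("report.pdf")
# print(ask_every_page(pages, "What is the invoice total?", make_donut_asker()))
```

One caveat on the "search" part: the model looks at one page at a time and will usually produce some answer for every page, relevant or not, and this sketch has no confidence score to rank them. So you get one candidate answer per page and still have to decide which page to trust.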

awacke1

Sep 27, 2022

I tried this with sample images, and it worked as long as the text was clear and the data was columnar, like a table. The wording of your question also has to match the labels in the document. Within those constraints the answers were near perfect. One thing I noticed with multi-line cells, say a long sentence that continues onto a second line, is that it doesn't pick up the first line. Still, it performed better than LayoutLMv2, which was super impressive.

calebaryee321

Sep 27, 2022

@awacke1 that's awesome, do you mind sharing your code?

thefirebanks

Mar 4, 2023

@calebaryee321 did you end up finding a way of applying this to PDFs? Or at least a way of extracting the text/content from the PDF to feed it to another QA model?
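On the text-extraction route, a rough sketch, assuming the `pypdf` package and a PDF that actually has a text layer (scanned PDFs have none and would need OCR first). The helper names `extract_page_texts` and `build_context` are hypothetical, and which QA model receives the result is left open.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the embedded text of each page (assumes pypdf is installed)."""
    from pypdf import PdfReader  # lazy import: optional dependency

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for image-only pages.
    return [page.extract_text() or "" for page in reader.pages]


def build_context(page_texts: List[str]) -> str:
    """Join non-empty pages into one context string, tagging each with its page number."""
    parts = []
    for number, text in enumerate(page_texts, start=1):
        cleaned = " ".join(text.split())  # collapse line breaks and repeated spaces
        if cleaned:
            parts.append(f"[page {number}] {cleaned}")
    return "\n".join(parts)


# Hypothetical usage (file name is a placeholder):
# context = build_context(extract_page_texts("report.pdf"))
# The context string can then be passed to a text-based QA model of your choice.
```

Plain text extraction drops the layout, so tables and multi-column pages may come out in an odd reading order.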
