Can I use this model to extract text from an entire document?

#3
by sergenti - opened

Hey there, I am working on a PDF parsing project.

Is there a way to use this model to extract an entire page?

Or are there any other models capable of extracting text from images like this one? (Don't mind the red rectangle.)
I tried other Python libraries, and the results were bad.

image.png

P.S. yes, I am using another model to detect tables and remove them in order to improve the parsing

P.P.S. yes, the image above is taken from "Attention Is All You Need" lol

Maybe you can try LayoutLMv3, which can analyze document layout and help detect tables, titles, text, etc.

In the end, did you find a way to extract an entire page?

I'm still trying.
