Using this model as a QA tool/OCR on a text-heavy document?

by Techie5879 - opened

Can this model be used accurately for QA or OCR text extraction on a text-heavy document? Current OCR solutions struggle with multi-column formats. We want to extract text efficiently from such documents and put it into a structured format (JSON). Can this model be used for this? If so, does it have to be fine-tuned, or can we try out-of-the-box inference by changing the test image and the prompt and expect usable results?

I don't know whether the model was trained or tested on text-heavy documents, @Techie5879. As a matter of fact, I'll evaluate it on a document QA task soon because I'm curious about it as well.

I'm working on the problem myself. In the meantime, any updates, @Molbap?
