working mechanism
#11 by BoccheseGiacomo - opened
I have a question: does this only work with text documents, or also with images? If I have a PDF that is stored as an image, does this work? And if I have a PDF with tables, does it convert everything to raw UTF-8 text, or is it able to process structures (images, tables, HTML text) as they are?
Thanks
As far as I can tell, it only uses the text extracted from the images, and the input needs to be in a "segmentId" format.
However, check out katanaml, and also their GitHub repo: https://github.com/katanaml/sparrow
Thanks for the GitHub repo, that's really cool.
BoccheseGiacomo changed discussion status to closed