
Prasanna Iyer

prasiyer

AI & ML interests

None yet

Recent Activity

Organizations

None yet

prasiyer's activity

reacted to fdaudens's post with 🔥 about 2 months ago
Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet?

Open source olmOCR just dropped and the results are impressive.

Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU, which works out to about $190 per million pages, or 1/32 the cost of GPT-4o. Game-changer for content extraction and digital archives.

To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images.

Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up.

👉 Try the demo: https://olmocr.allenai.org

Going right into the AI toolkit: JournalistsonHF/ai-toolkit
reacted to nroggendorff's post with 👍 3 months ago
Maybe a page where you can find open orgs to get started collaborating with HF. I see so many people who don't have a direction.

I don't have ulterior motives, so don't ask.
replied to merve's post 9 months ago

I have attached the image and the prompt. This is the response from the chatbot:

The image depicts a table comparing various companies and their carbon footprints. The table lists the top 10 largest greenhouse gas emitters in the world, with their respective carbon footprints expressed in millions of metric tons of carbon dioxide equivalent (MtCO2e). The first column features the company name, while the second column displays the year in which the carbon footprint was measured. The third column indicates the sector in which the company operates, and the fourth column provides the company's carbon footprint. The last column shows the company's market capitalization, which is the total value of all outstanding shares of a company's stock. The table also includes a row labeled "Total" that shows the total carbon footprint of all the companies listed in the table.

replied to merve's post 9 months ago

Thanks for the post and your efforts to share the knowledge.

The Space at https://huggingface.co/spaces/merve/chameleon-7b does not seem to work. When I ask for a summary of an image, the result is a summary of some random table, not of the image I uploaded. Please check when you can.