Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8, 2024 • 82
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published Oct 17, 2024 • 30
DocLayout-YOLO Collection Dataset and model for DocLayout-YOLO • 10 items • Updated 21 days ago • 12
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation Paper • 2410.00890 • Published Oct 1, 2024 • 19