Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8, 2024 • 83
The Unreasonable Ineffectiveness of the Deeper Layers Paper • 2403.17887 • Published Mar 26, 2024 • 81
How Far Are We from Intelligent Visual Deductive Reasoning? Paper • 2403.04732 • Published Mar 7, 2024 • 24