SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers Paper • 2502.20545 • Published 10 days ago • 20
Multi-Turn Code Generation Through Single-Step Rewards Paper • 2502.20380 • Published 11 days ago • 29
Awesome Computer Use Agents Collection https://github.com/ranpox/awesome-computer-use • 25 items • Updated Dec 18, 2024 • 11
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 64
Qwen2.5-VL (All Versions) Collection All versions of Qwen2.5-VL including 4-bit, 16-bit and more! • 9 items • Updated 11 days ago • 8
Article Docmatix - a huge dataset for Document Visual Question Answering Jul 18, 2024 • 72
THOUGHTSCULPT: Reasoning with Intermediate Revision and Search Paper • 2404.05966 • Published Apr 9, 2024 • 2
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16, 2024 • 78
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue Paper • 2402.05930 • Published Feb 8, 2024 • 39
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding Paper • 2401.12954 • Published Jan 23, 2024 • 30
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 115
Scaling Laws for Downstream Task Performance of Large Language Models Paper • 2402.04177 • Published Feb 6, 2024 • 18
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Paper • 2402.04858 • Published Feb 7, 2024 • 15
ScreenAI: A Vision-Language Model for UI and Infographics Understanding Paper • 2402.04615 • Published Feb 7, 2024 • 43
Direct Language Model Alignment from Online AI Feedback Paper • 2402.04792 • Published Feb 7, 2024 • 31