SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published 27 days ago • 92
xLAM-2 Collection A family of Large Action Models for multi-turn conversation and tool-use • 8 items • Updated 3 days ago • 9
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks Paper • 2504.05118 • Published 3 days ago • 22
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 3 days ago • 142
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published 2 days ago • 81
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 302
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving Paper • 2504.02605 • Published 7 days ago • 39
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay Paper • 2504.03601 • Published 6 days ago • 13
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement Paper • 2504.03561 • Published 6 days ago • 16
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 2 days ago • 42
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models Paper • 2503.22165 • Published 14 days ago • 26
Interpreting Emergent Planning in Model-Free Reinforcement Learning Paper • 2504.01871 • Published 8 days ago • 11
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL Paper • 2503.23157 • Published 12 days ago • 7
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published 15 days ago • 36