Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching Paper • 2412.17153 • Published 3 days ago • 29
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published 3 days ago • 25
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 3 days ago • 35
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 3 days ago • 35
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published 7 days ago • 74
In Case You Missed It: ARC 'Challenge' Is Not That Challenging Paper • 2412.17758 • Published 3 days ago • 10
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding Paper • 2412.18450 • Published 1 day ago • 28
Large Action Models: From Inception to Implementation Paper • 2412.10047 • Published 13 days ago • 28
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 13 days ago • 131
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 13 days ago • 75
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published 8 days ago • 43
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 8 days ago • 103
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published 7 days ago • 31
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published 7 days ago • 51