Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published 3 days ago • 83
Generative Evaluation of Complex Reasoning in Large Language Models Paper • 2504.02810 • Published 8 days ago • 11
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries Paper • 2502.20475 • Published Feb 27 • 3
IHEval: Evaluating Language Models on Following the Instruction Hierarchy Paper • 2502.08745 • Published Feb 12 • 19
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 34
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 151
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning Paper • 2502.11271 • Published Feb 16 • 17
Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options Paper • 2502.12929 • Published Feb 18 • 7
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading Paper • 2502.12574 • Published Feb 18 • 11
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published Feb 18 • 69
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs Paper • 2502.10454 • Published Feb 12 • 7
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems Paper • 2502.11098 • Published Feb 16 • 13
System Message Generation for User Preferences using Open-Source Models Paper • 2502.11330 • Published Feb 17 • 15
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity Paper • 2502.11901 • Published Feb 17 • 6