TransMamba: Flexibly Switching between Transformer and Mamba Paper • 2503.24067 • Published 22 days ago • 19
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 23 days ago • 94
Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners Paper • 2502.20339 • Published Feb 27 • 2
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published 14 days ago • 103 • 6
Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers? Paper • 2503.10632 • Published Mar 13 • 14
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 277
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges Paper • 2401.07339 • Published Jan 14, 2024 • 1