TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools Paper • 2503.10970 • Published 4 days ago • 10
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving Paper • 2503.05689 • Published 10 days ago • 2
Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers? Paper • 2503.10632 • Published 4 days ago • 8
Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption Paper • 2503.09279 • Published 6 days ago • 5
FlowTok: Flowing Seamlessly Across Text and Image Tokens Paper • 2503.10772 • Published 4 days ago • 12
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models Paper • 2503.11224 • Published 4 days ago • 21
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 3 days ago • 81
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity Paper • 2503.07677 • Published 8 days ago • 66
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering Paper • 2503.09590 • Published 5 days ago • 3
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System Paper • 2503.09600 • Published 5 days ago • 4
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space Paper • 2503.09419 • Published 5 days ago • 5
Cost-Optimal Grouped-Query Attention for Long-Context LLMs Paper • 2503.09579 • Published 5 days ago • 5
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning Paper • 2503.07588 • Published 7 days ago • 7
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG Paper • 2503.04388 • Published 11 days ago • 15
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published 5 days ago • 23
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation Paper • 2503.09151 • Published 6 days ago • 29
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 5 days ago • 54
PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling Paper • 2503.09368 • Published 5 days ago • 2