Submitted by therem 94 I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders · 7 authors 2
Submitted by VictorYuki 56 Position: Interactive Generative Video as Next-Generation Game Engine · 8 authors 3
Submitted by AndrewZeng 25 SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild · 7 authors 1
Submitted by Dvir 20 OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models · 5 authors 2
Submitted by weepiess2383 16 CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models · 4 authors 2
Submitted by akhaliq 16 Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning · 7 authors 2
Submitted by akhaliq 14 FFN Fusion: Rethinking Sequential Computation in Large Language Models · 18 authors 3
Submitted by QizhiPei 13 LEMMA: Learning from Errors for MatheMatical Advancement in LLMs · 10 authors 2
Submitted by Lingaaaaaaa 12 Training-free Diffusion Acceleration with Bottleneck Sampling · 9 authors 2
Submitted by zhangysk 11 Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models · 11 authors 1
Submitted by CedPei 10 Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models · 8 authors 2
Submitted by BestWishYsh 8 MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation · 8 authors 2
Submitted by alandao 6 AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning · 3 authors 2
Submitted by oneonlee 6 Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering · 5 authors 2
Submitted by nielsr 5 Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models · 5 authors 2
Submitted by Abdul084 5 Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts? · 6 authors 2
Submitted by akhaliq 5 V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms · 5 authors 2
Submitted by akhaliq 3 RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation · 8 authors 2
Submitted by zhenyupan 2 MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse · 2 authors 2
Submitted by SherryXTChen 2 Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning · 3 authors 2
Submitted by davidserra9 1 Revisiting Image Fusion for Multi-Illuminant White-Balance Correction · 6 authors 2
Submitted by WeiDeng1999 - Global-Local Tree Search for Language Guided 3D Scene Generation · 3 authors 2
Submitted by shawnricecake - QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge · 12 authors 2
Submitted by KyanChen - DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding · 6 authors 2