Daily Papers

by AK and the research community

Mar 21

Submitted by

apryc1

One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

·
9 authors

2

Submitted by

Asaf-Yehudai

Survey on Evaluation of LLM-based Agents

·
8 authors

2

Submitted by

yangsui

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

·
12 authors

2

Submitted by

ZeqiangLai

Unleashing Vecset Diffusion Model for Fast Shape Generation

·
13 authors

3

Submitted by

quickjkee

Scale-wise Distillation of Diffusion Models

·
4 authors

4

Submitted by

richardaecn

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

·
45 authors

Submitted by

zhwang4ai

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

·
5 authors

2

Submitted by

MingleiShi

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

·
13 authors

5

Submitted by

philschmid

Why Do Multi-Agent LLM Systems Fail?

·
13 authors

Submitted by

akhaliq

InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

·
6 authors

Submitted by

Huan-WhoRegisteredMyName

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

·
5 authors

Submitted by

quyanh

Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

·
2 authors

2

Submitted by

akhaliq

Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning

·
16 authors

Submitted by

QizhiPei

MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion

·
9 authors

2

Submitted by

akhaliq

SynCity: Training-Free Generation of 3D Worlds

·
5 authors

Submitted by

zorik

Inside-Out: Hidden Factual Knowledge in LLMs

·
8 authors

1

Submitted by

DyrusQZ

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

·
11 authors

2

Submitted by

xueyanz

M3: 3D-Spatial MultiModal Memory

·
7 authors

Submitted by

adamdad

1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

·
4 authors

Submitted by

Ningyu

CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners

·
7 authors

2

Submitted by

roseannelexie

Ultra-Resolution Adaptation with Ease

·
4 authors

Submitted by

mathfinder

Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

·
7 authors

Submitted by

akhaliq

XAttention: Block Sparse Attention with Antidiagonal Scoring

·
5 authors

Submitted by

cientgu

Tokenize Image as a Set

·
4 authors

Submitted by

akhaliq

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

·
6 authors

Submitted by

rexleeppp

NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes

·
3 authors

2

Submitted by

Sarim-Hash

SALT: Singular Value Adaptation with Low-Rank Transformation

·
6 authors

2

Submitted by

zhenglin

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion

·
4 authors

2

Submitted by

pierrechambon

BigO(Bench) -- Can LLMs Generate Code with Controlled Time and Space Complexity?

·
4 authors

2

Submitted by

zhongwenxu

Agents Play Thousands of 3D Video Games

·
7 authors

2

Submitted by

kpzhang996

CLS-RL: Image Classification with Rule-Based Reinforcement Learning

·
5 authors

Submitted by

lyc0930

Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling

·
7 authors

Submitted by

Gofinge

Sonata: Self-Supervised Learning of Reliable Point Representations

·
10 authors

2

Submitted by

guolinke

Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens

·
8 authors

2

Submitted by

ynhe

Make Your Training Flexible: Towards Deployment-Efficient Video Models

·
6 authors

Submitted by

c-juhwan

See-Saw Modality Balance: See Gradient, and Sew Impaired Vision-Language Balance to Mitigate Dominant Modality Bias

·
5 authors

2

Submitted by

BestWishYsh

MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization

·
7 authors

Submitted by

lyx97

UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

·
7 authors

Submitted by

kpzhang996

Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction

·
3 authors

2

Submitted by

UVSKKR

Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content

·
3 authors

2

Submitted by

HJGO

VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling

·
6 authors

2

Submitted by

lxxiao

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

·
10 authors

2

Submitted by

MAGAer13

Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

·
5 authors

Submitted by

ab9mamun

AIMI: Leveraging Future Knowledge and Personalization in Sparse Event Forecasting for Treatment Adherence

·
3 authors

2

Submitted by

Zilence006

Where do Large Vision-Language Models Look at when Answering Questions?

·
9 authors

2

Submitted by

potamides

TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

·
8 authors

Submitted by

wljungbergh

GASP: Unifying Geometric and Semantic Self-Supervised Pre-training for Autonomous Driving

·
9 authors

Submitted by

Devy1

Why Personalizing Deep Learning-Based Code Completion Tools Matters

·
3 authors

2