Daily Papers

by AK and the research community

Mar 25

Submitted by

therem

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

·
7 authors

Submitted by

Liuff23

Video-T1: Test-Time Scaling for Video Generation

·
6 authors

Submitted by

VictorYuki

Position: Interactive Generative Video as Next-Generation Game Engine

·
8 authors

Submitted by

AndrewZeng

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

·
7 authors

Submitted by

akhaliq

Aether: Geometric-Aware Unified World Modeling

·
11 authors

Submitted by

Dvir

OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models

·
5 authors

Submitted by

shuaishuaicdp

Judge Anything: MLLM as a Judge Across Any Modality

·
13 authors

Submitted by

weepiess2383

CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

·
4 authors

Submitted by

akhaliq

Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning

·
7 authors

Submitted by

iliashum

Defeating Prompt Injections by Design

·
10 authors

Submitted by

mdmoor

AgentRxiv: Towards Collaborative Autonomous Research

·
2 authors

Submitted by

akhaliq

FFN Fusion: Rethinking Sequential Computation in Large Language Models

·
18 authors

Submitted by

QizhiPei

LEMMA: Learning from Errors for MatheMatical Advancement in LLMs

·
10 authors

Submitted by

maincold2

Optimized Minimal 3D Gaussian Splatting

·
3 authors

Submitted by

Lingaaaaaaa

Training-free Diffusion Acceleration with Bottleneck Sampling

·
9 authors

Submitted by

cientgu

Equivariant Image Modeling

·
6 authors

Submitted by

zhangysk

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

·
11 authors

Submitted by

CedPei

Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models

·
8 authors

Submitted by

ryoungj

Reasoning to Learn from Latent Thoughts

·
4 authors

Submitted by

BestWishYsh

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

·
8 authors

Submitted by

alandao

AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning

·
3 authors

Submitted by

oneonlee

Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering

·
5 authors

Submitted by

nielsr

Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models

·
5 authors

Submitted by

Abdul084

Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?

·
6 authors

Submitted by

akhaliq

V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms

·
5 authors

Submitted by

akhaliq

AMD-Hummingbird: Towards an Efficient Text-to-Video Model

·
6 authors

Submitted by

louisowen6

Variance Control via Weight Rescaling in LLM Pre-training

·
4 authors

Submitted by

zygg

Mind with Eyes: from Language Reasoning to Multimodal Reasoning

·
5 authors

Submitted by

akhaliq

RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation

·
8 authors

Submitted by

zhenyupan

MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse

·
2 authors

Submitted by

SherryXTChen

Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

·
3 authors

Submitted by

nzl-thu

CODA: Repurposing Continuous VAEs for Discrete Tokenization

·
7 authors

Submitted by

MarkChenX

Verbal Process Supervision Elicits Better Coding Agents

·
3 authors

Submitted by

davidserra9

Revisiting Image Fusion for Multi-Illuminant White-Balance Correction

·
6 authors

Submitted by

davidserra9

Rethinking Image Evaluation in Super-Resolution

·
6 authors

Submitted by

edodema

Human Motion Unlearning

·
5 authors

Submitted by

WeiDeng1999

Global-Local Tree Search for Language Guided 3D Scene Generation

·
3 authors

Submitted by

shawnricecake

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

·
12 authors

Submitted by

KyanChen

DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding

·
6 authors
