PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations? Paper • 2503.05333 • Published 7 days ago • 1
Multimodal Language Modeling for High-Accuracy Single Cell Transcriptomics Analysis and Generation Paper • 2503.09427 • Published 1 day ago • 3
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published 1 day ago • 4
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG Paper • 2503.04388 • Published 8 days ago • 12
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published 3 days ago • 13
VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering Paper • 2503.06492 • Published 5 days ago • 9
CineBrain: A Large-Scale Multi-Modal Brain Dataset During Naturalistic Audiovisual Narrative Processing Paper • 2503.06940 • Published 4 days ago • 11
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models Paper • 2503.08686 • Published 3 days ago • 14
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published 4 days ago • 31
Gemini Embedding: Generalizable Embeddings from Gemini Paper • 2503.07891 • Published 3 days ago • 24
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published 6 days ago • 30
UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models Paper • 2503.08120 • Published 3 days ago • 27
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published 3 days ago • 53
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL Paper • 2503.07536 • Published 4 days ago • 71
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 3 days ago • 89
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks Paper • 2503.06885 • Published 4 days ago • 3