SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published 2 days ago • 61
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 28 days ago • 184
LRM-Zero: Training Large Reconstruction Models with Synthesized Data Paper • 2406.09371 • Published Jun 13, 2024 • 5
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation Paper • 2406.09305 • Published Jun 13, 2024 • 5
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding Paper • 2406.09297 • Published Jun 13, 2024 • 6
Real3D: Scaling Up Large Reconstruction Models with Real-World Images Paper • 2406.08479 • Published Jun 12, 2024 • 7
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark Paper • 2406.05967 • Published Jun 10, 2024 • 6
CMC-Bench: Towards a New Paradigm of Visual Signal Compression Paper • 2406.09356 • Published Jun 13, 2024 • 5
Understanding Hallucinations in Diffusion Models through Mode Interpolation Paper • 2406.09358 • Published Jun 13, 2024 • 5
Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus Paper • 2406.08598 • Published Jun 12, 2024 • 6
Mistral-C2F: Coarse to Fine Actor for Analytical and Reasoning Enhancement in RLHF and Effective-Merged LLMs Paper • 2406.08657 • Published Jun 12, 2024 • 10
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? Paper • 2406.07546 • Published Jun 11, 2024 • 9
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation Paper • 2406.08656 • Published Jun 12, 2024 • 8
Explore the Limits of Omni-modal Pretraining at Scale Paper • 2406.09412 • Published Jun 13, 2024 • 11
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts Paper • 2406.09162 • Published Jun 13, 2024 • 14
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus Paper • 2406.08707 • Published Jun 13, 2024 • 16
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities Paper • 2406.09406 • Published Jun 13, 2024 • 15
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published Jun 12, 2024 • 16