Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration Paper • 2412.13180 • Published 5 days ago • 12
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations Paper • 2412.13171 • Published 5 days ago • 30
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models Paper • 2412.12606 • Published 6 days ago • 40
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain Paper • 2412.13018 • Published 5 days ago • 39
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge Paper • 2412.13670 • Published 5 days ago • 4
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment Paper • 2412.13746 • Published 4 days ago • 8
ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers Paper • 2412.12571 • Published 6 days ago • 7
Autoregressive Video Generation without Vector Quantization Paper • 2412.14169 • Published 4 days ago • 12
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published 5 days ago • 12
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published 6 days ago • 37
TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation Paper • 2412.14642 • Published 4 days ago • 4
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published 12 days ago • 49
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting Paper • 2411.17176 • Published 27 days ago • 22
TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models Paper • 2411.18350 • Published 25 days ago • 22
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning Paper • 2411.18203 • Published 26 days ago • 30