Ada3D: Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection Paper • 2307.08209 • Published Jul 17, 2023 • 1
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks Paper • 2112.15139 • Published Dec 30, 2021
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection Paper • 2205.11098 • Published May 23, 2022
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? Paper • 2212.08320 • Published Dec 16, 2022
DiTFastAttn: Attention Compression for Diffusion Transformer Models Paper • 2406.08552 • Published Jun 12, 2024 • 25
Accelerating Diffusion Transformers with Token-wise Feature Caching Paper • 2410.05317 • Published Oct 5, 2024
SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context Paper • 2411.16213 • Published Nov 25, 2024
Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model Paper • 2411.10803 • Published Nov 16, 2024
Compression with Global Guidance: Towards Training-free High-Resolution MLLMs Acceleration Paper • 2501.05179 • Published Jan 9, 2025
RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning Paper • 2502.00848 • Published Feb 2, 2025
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More Paper • 2502.11494 • Published Feb 17, 2025
ProReflow: Progressive Reflow with Decomposed Velocity Paper • 2503.04824 • Published 20 days ago • 9
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning Paper • 2410.06664 • Published Oct 9, 2024 • 1
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published 6 days ago • 19