ShortGPT: Layers in Large Language Models are More Redundant Than You Expect • arXiv:2403.03853 • Published Mar 6, 2024
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks • arXiv:2402.09025 • Published Feb 14, 2024
Shortened LLaMA: A Simple Depth Pruning for Large Language Models • arXiv:2402.02834 • Published Feb 5, 2024
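The three entries above share a depth-pruning theme: score each transformer block by how little it changes its hidden states, then drop the most redundant blocks. Below is a minimal PyTorch sketch of that scoring idea, in the spirit of ShortGPT's Block Influence metric (1 minus the mean cosine similarity between a block's input and output hidden states). The tensor shapes and the nearly-identity example are illustrative assumptions, not code from any of these papers.

```python
import torch
import torch.nn.functional as F

def block_influence(hidden_in: torch.Tensor, hidden_out: torch.Tensor) -> float:
    """Score a transformer block's influence as 1 minus the mean cosine
    similarity between its input and output hidden states (a sketch of
    ShortGPT-style Block Influence; details vary across the papers above)."""
    # Per-token cosine similarity, averaged over batch and sequence positions.
    cos = F.cosine_similarity(hidden_in, hidden_out, dim=-1)
    # High similarity => the block barely transforms its input => low influence.
    return 1.0 - cos.mean().item()

# Hypothetical usage with random tensors standing in for hidden states of
# shape (batch, seq_len, hidden_dim), e.g. collected via output_hidden_states.
h_in = torch.randn(2, 16, 4096)
h_out = h_in + 0.01 * torch.randn_like(h_in)  # a nearly-identity block
print(block_influence(h_in, h_out))  # near 0 -> candidate for removal
```

In practice one would compute this score per layer on a small calibration set and remove the lowest-scoring blocks, which is the common recipe behind these depth-pruning results.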
Larimar: Large Language Models with Episodic Memory Control • arXiv:2403.11901 • Published Mar 18, 2024
Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task • arXiv:2409.04005 • Published Sep 6, 2024
Self-Discover: Large Language Models Self-Compose Reasoning Structures • arXiv:2402.03620 • Published Feb 6, 2024
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications • arXiv:2408.03703 • Published Aug 7, 2024