- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention (arXiv:2404.07143, published Apr 10, 2024)
- Mora: Enabling Generalist Video Generation via A Multi-Agent Framework (arXiv:2403.13248, published Mar 20, 2024)
- MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features (arXiv:2307.12698, published Jul 24, 2023)
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection (arXiv:2403.03507, published Mar 6, 2024)
- Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers (arXiv:2401.11605, published Jan 21, 2024)
- Bytes Are All You Need: Transformers Operating Directly On File Bytes (arXiv:2306.00238, published May 31, 2023)