-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 140 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 10 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 48 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 44
Collections
Discover the best community collections!
Collections including paper arxiv:2403.03507
-
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 135 -
Elucidating the Design Space of Diffusion-Based Generative Models
Paper • 2206.00364 • Published • 13 -
GLU Variants Improve Transformer
Paper • 2002.05202 • Published • 1 -
StarCoder 2 and The Stack v2: The Next Generation
Paper • 2402.19173 • Published • 130
-
Rho-1: Not All Tokens Are What You Need
Paper • 2404.07965 • Published • 83 -
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Paper • 2404.10667 • Published • 13 -
Instruction-tuned Language Models are Better Knowledge Learners
Paper • 2402.12847 • Published • 24 -
DoRA: Weight-Decomposed Low-Rank Adaptation
Paper • 2402.09353 • Published • 24
-
Jamba: A Hybrid Transformer-Mamba Language Model
Paper • 2403.19887 • Published • 101 -
sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published • 38 -
ViTAR: Vision Transformer with Any Resolution
Paper • 2403.18361 • Published • 50 -
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Paper • 2403.18814 • Published • 42