Collections
Collections including paper arxiv:2402.17764
-
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
Paper • 2402.14083 • Published • 43
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 585
Genie: Generative Interactive Environments
Paper • 2402.15391 • Published • 70
Humanoid Locomotion as Next Token Prediction
Paper • 2402.19469 • Published • 25
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 585
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 181
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 52
ResLoRA: Identity Residual Mapping in Low-Rank Adaption
Paper • 2402.18039 • Published • 11
-
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Paper • 2402.15627 • Published • 33
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Paper • 2402.16822 • Published • 15
FuseChat: Knowledge Fusion of Chat Models
Paper • 2402.16107 • Published • 36
Multi-LoRA Composition for Image Generation
Paper • 2402.16843 • Published • 28