- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 10
- Perspectives on the State and Future of Deep Learning -- 2023
  Paper • 2312.09323 • Published • 5
- MobileSAMv2: Faster Segment Anything to Everything
  Paper • 2312.09579 • Published • 20
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 17
Collections
Collections including paper arxiv:2312.10035
- aMUSEd: An Open MUSE Reproduction
  Paper • 2401.01808 • Published • 28
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
  Paper • 2401.00604 • Published • 4
- LARP: Language-Agent Role Play for Open-World Games
  Paper • 2312.17653 • Published • 31
- SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
  Paper • 2312.07987 • Published • 41
- Interfacing Foundation Models' Embeddings
  Paper • 2312.07532 • Published • 10
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 17
- TheBloke/quantum-v0.01-GPTQ
  Text Generation • Updated • 17 • 2
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 42
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 17
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 12
- PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
  Paper • 2312.17276 • Published • 15