OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 126
Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning Paper • 2303.15647 • Published Mar 28, 2023 • 4
Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer Paper • 2205.12148 • Published May 24, 2022 • 2
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published Dec 2024 • 41