One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation Paper • 2410.07170 • Published Oct 9 • 15
Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization Paper • 2410.04717 • Published Oct 7 • 17
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction Paper • 2409.17422 • Published Sep 25 • 24 • 5
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Paper • 2408.15079 • Published Aug 27 • 52 • 4
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models Paper • 2408.15518 • Published Aug 28 • 42 • 4
Better Alignment with Instruction Back-and-Forth Translation Paper • 2408.04614 • Published Aug 8 • 14 • 3
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 75 • 3