- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 33
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 69
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 172
- MathScale: Scaling Instruction Tuning for Mathematical Reasoning
  Paper • 2403.02884 • Published • 14
peng (superpeng)