Optimizer - a leideng Collection

Optimizer

updated about 10 hours ago

Muon is Scalable for LLM Training

Paper • 2502.16982 • Published Feb 24, 2025