Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models Paper • 2506.18945 • Published 24 days ago • 39
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance Paper • 2506.06444 • Published Jun 6 • 73
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning Paper • 2505.24846 • Published May 30 • 15
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 133
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17 • 92
Predictive Data Selection: The Data That Predicts Is the Data That Teaches Paper • 2503.00808 • Published Mar 2 • 57
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training Paper • 2501.06842 • Published Jan 12 • 16
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN Paper • 2412.13795 • Published Dec 18, 2024 • 20
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts Paper • 2407.03203 • Published Jul 3, 2024 • 12
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise Paper • 2312.14567 • Published Dec 22, 2023 • 1
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models Paper • 2306.12420 • Published Jun 21, 2023 • 2
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment Paper • 2304.06767 • Published Apr 13, 2023 • 2