Light-R1 • Collection • Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated 26 days ago • 11
FuseO1-Preview • Collection • System-II Reasoning Fusion of LLMs • 11 items • Updated about 5 hours ago • 21
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking • Paper • 2501.04519 • Published Jan 8 • 275