SFT-only using open-r1/codeforces produced top-tier performance? Impressive! Will you do coding RL next?
Useful Pretrain-Datasets Collection pretrain-datasets with (maybe) good quality • 21 items • Updated 2 days ago • 1
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Paper • 2502.19361 • Published 16 days ago • 26
awesome-zh-corpus Collection high-quality Chinese corpora you can find • 15 items • Updated 14 days ago