InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning • Paper • 2408.07089 • Published Aug 9 • 12
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models • Paper • 2409.16191 • Published 7 days ago • 38
Training Language Models to Self-Correct via Reinforcement Learning • Paper • 2409.12917 • Published 12 days ago • 127