Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023
RRHF: Rank Responses to Align Language Models with Human Feedback without tears Paper • 2304.05302 • Published Apr 11, 2023