Wukong: Towards a Scaling Law for Large-Scale Recommendation Paper • 2403.02545 • Published Mar 4 • 15
Secrets of RLHF in Large Language Models Part I: PPO Paper • 2307.04964 • Published Jul 11, 2023 • 27