Self-Play Preference Optimization for Language Model Alignment • Paper 2405.00675 • Published 21 days ago