PERL: Parameter Efficient Reinforcement Learning from Human Feedback • Paper 2403.10704 • Published Mar 15, 2024
RewardBench: Evaluating Reward Models for Language Modeling • Paper 2403.13787 • Published Mar 20, 2024
Direct Preference Optimization: Your Language Model is Secretly a Reward Model • Paper 2305.18290 • Published May 29, 2023
DRLC: Reinforcement Learning with Dense Rewards from LLM Critic • Paper 2401.07382 • Published Jan 14, 2024
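The DPO entry above rests on the paper's implicit-reward argument; as a quick reference, here is a minimal sketch of the DPO objective in the paper's notation:

$$\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]$$

Here $\pi_\theta$ is the policy being trained, $\pi_{\mathrm{ref}}$ a frozen reference policy, $(x, y_w, y_l)$ a prompt with preferred and dispreferred completions, $\beta$ the strength of the KL constraint, and $\sigma$ the logistic function. The implicit reward $r(x, y) = \beta\log\frac{\pi_\theta(y\mid x)}{\pi_{\mathrm{ref}}(y\mid x)}$ (up to a prompt-dependent constant) is the sense in which the language model is "secretly a reward model."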