Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint Paper • 2312.11456 • Published Dec 18, 2023
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models Paper • 2306.12420 • Published Jun 21, 2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment Paper • 2304.06767 • Published Apr 13, 2023
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding Paper • 2310.15308 • Published Oct 23, 2023