RLHF/RLAIF - a felixbrock Collection

RLHF/RLAIF

updated Oct 27, 2023

Efficient RLHF: Reducing the Memory Usage of PPO

Paper • 2309.00754 • Published Sep 1, 2023 • 13

Statistical Rejection Sampling Improves Preference Optimization

Paper • 2309.06657 • Published Sep 13, 2023 • 13

Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?

Paper • 2309.07462 • Published Sep 14, 2023 • 4

Stabilizing RLHF through Advantage Model and Selective Rehearsal

Paper • 2309.10202 • Published Sep 18, 2023 • 9

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 30

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Paper • 2310.12773 • Published Oct 19, 2023 • 28

Contrastive Preference Learning: Learning from Human Feedback without RL

Paper • 2310.13639 • Published Oct 20, 2023 • 24