-
Moral Foundations of Large Language Models
Paper • 2310.15337 • Published • 1 -
Specific versus General Principles for Constitutional AI
Paper • 2310.13798 • Published • 2 -
Contrastive Prefence Learning: Learning from Human Feedback without RL
Paper • 2310.13639 • Published • 21 -
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper • 2309.00267 • Published • 45
Collections
Discover the best community collections!
Collections including paper arxiv:2310.12773
-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 93 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 61 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 39 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 39
-
Safe RLHF: Safe Reinforcement Learning from Human Feedback
Paper • 2310.12773 • Published • 26 -
The Generative AI Paradox: "What It Can Create, It May Not Understand"
Paper • 2311.00059 • Published • 17 -
LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B
Paper • 2310.20624 • Published • 12 -
Moral Foundations of Large Language Models
Paper • 2310.15337 • Published • 1
-
Secrets of RLHF in Large Language Models Part I: PPO
Paper • 2307.04964 • Published • 26 -
Safe RLHF: Safe Reinforcement Learning from Human Feedback
Paper • 2310.12773 • Published • 26 -
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 9 -
Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
Paper • 2310.00212 • Published • 2
-
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 37 -
EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation
Paper • 2310.08185 • Published • 6 -
The Consensus Game: Language Model Generation via Equilibrium Search
Paper • 2310.09139 • Published • 11 -
In-Context Pretraining: Language Modeling Beyond Document Boundaries
Paper • 2310.10638 • Published • 26
-
Efficient RLHF: Reducing the Memory Usage of PPO
Paper • 2309.00754 • Published • 13 -
Statistical Rejection Sampling Improves Preference Optimization
Paper • 2309.06657 • Published • 12 -
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Paper • 2309.07462 • Published • 3 -
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 9