Collections
Discover the best community collections!
Collections including paper arxiv:2309.10202
-
Efficient RLHF: Reducing the Memory Usage of PPO
Paper • 2309.00754 • Published • 13 -
Statistical Rejection Sampling Improves Preference Optimization
Paper • 2309.06657 • Published • 13 -
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Paper • 2309.07462 • Published • 4 -
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 9
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 10