Collections
Collections including paper arxiv:2305.18290
- Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
  Paper • 1803.05457 • Published • 2
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 237
- Large Language Models are Zero-Shot Reasoners
  Paper • 2205.11916 • Published • 1
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
  Paper • 2404.14619 • Published • 120
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
  Paper • 2401.01967 • Published
- Secrets of RLHF in Large Language Models Part I: PPO
  Paper • 2307.04964 • Published • 26
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 116
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
  Paper • 2404.05961 • Published • 62
- A General Theoretical Paradigm to Understand Learning from Human Preferences
  Paper • 2310.12036 • Published • 11
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 57
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 37
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 59
- A Survey on Data Selection for Language Models
  Paper • 2402.16827 • Published • 3
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
  Paper • 2402.00159 • Published • 55
- The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
  Paper • 2306.01116 • Published • 28
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 37
- HyperCLOVA X Technical Report
  Paper • 2404.01954 • Published • 16
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 10
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 80