Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge Paper • 2407.19594 • Published Jul 28
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline Paper • 2406.11939 • Published Jun 17