Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks Paper • 2503.04378 • Published 10 days ago • 6
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated Jan 17 • 153
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 23 • 5