MLGym: A New Framework and Benchmark for Advancing AI Research Agents • Paper • 2502.14499 • Published 26 days ago • 182
The Big Benchmarks Collection • Collection • Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 209
Nemotron 4 340B • Collection • Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated about 4 hours ago • 162