Scaling Synthetic Data Creation with 1,000,000,000 Personas • Paper • 2406.20094 • Published 21 days ago • 84
LiveBench: A Challenging, Contamination-Free LLM Benchmark • Paper • 2406.19314 • Published 22 days ago • 13
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery • Paper • 2406.08587 • Published Jun 12 • 15