Benchmark Collection Benchmarks for evaluation. Each benchmark is categorized into 3 difficulty levels. • 5 items • Updated 5 days ago
Data_dimension Collection Models trained using data with different filtering strategies (difficulty, quality filtering) • 4 items • Updated 6 days ago
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23, 2024 • 36