arxiv:2412.03205
Sergei Tilga
tilgasergey
Recent Activity
authored a paper 17 days ago
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs
authored a paper 17 days ago
Beemo: Benchmark of Expert-edited Machine-generated Outputs
upvoted a collection 17 days ago
U-MATH and μ-MATH - University-level math evaluation