arxiv:2605.28556
Yotam Perlitz
per
AI & ML interests
None yet
Recent Activity
authored a paper about 4 hours ago
DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
authored a paper about 4 hours ago
CLEAR: Error Analysis via LLM-as-a-Judge Made Easy
authored a paper about 4 hours ago
General Agent Evaluation