AI agents, agent evaluation, promotion gates, synthetic evidence, formal methods, contamination-resistant evaluation, model evaluation