AI reliability, legal AI, evaluation benchmarks, citation verification, evidence infrastructure, reproducibility, provenance systems, retrieval evaluation, small language models, AI safety.