Generative AI systems are becoming increasingly prevalent in society, producing text, images, audio, and video with far-reaching societal implications. While the NeurIPS Broader Impact statement has notably shifted norms for AI publications toward considering negative societal impact, no standard exists for how to assess that impact. This workshop aims to address this critical gap by bringing together experts on evaluation science and practitioners who develop and analyze technical systems.
This workshop builds on our previous initiatives, including the FAccT 2023 CRAFT session "Assessing the Impacts of Generative AI Systems Across Modalities and Society" and our initial "Evaluating the Social Impact of Generative AI Systems" report. Through these efforts, we collaboratively developed an evaluation framework and guidance for assessing generative systems across modalities. We have since crowdsourced evaluations and analyzed gaps in the literature, as well as systemic issues in how evaluations are designed and selected.
The goal of this workshop is to share our existing findings with the NeurIPS community and collectively develop future directions for effective community-built evaluations. By fostering collaboration between experts and practitioners, we aim to create more comprehensive evaluations and develop urgently needed policy recommendations for governments and AI safety organizations.