arxiv:2605.28911

Political Neutrality as Balanced Approval: A Large-Scale Human Evaluation of AI Responses

Published on May 27

Abstract

A new definition and benchmark for AI political neutrality is proposed, tested through a large-scale user study measuring approval across opposing political viewpoints on controversial issues.

As AI systems increasingly shape political views, defining and evaluating AI political neutrality is an urgent problem. Here, we propose a new definition of AI political neutrality and design a large-scale user study to test it, releasing a new dataset, PARETO, with 7,434 participants and 208,152 evaluations of AI responses. Our definition follows a simple principle grounded in political theory: when asked about a controversial issue, an AI model should generate responses that maximize approval across groups with opposing viewpoints, while balancing approval between groups. This definition allows empirical testing of whether an AI response is "neutral" and generalizes to any political context without presupposing a single left-right axis of division. We construct a benchmark of controversial U.S. issues, with prompts sourced from politically charged questions on Reddit and responses from frontier AI models, and recruit human participants to rate AI responses. Across all 20 issues, we find that it is possible for AI responses to achieve high rates of approval on both sides, even as those sides disagree strongly with each other on the substance of the issues. We also find that default responses lean liberal for GPT, Gemini, Claude, and Llama, but not Grok, and that politically charged user prompts are harder to respond to than neutral prompts. This work introduces a rigorous definition and benchmark of AI political neutrality, and a dataset to measure progress toward it.
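
The balanced-approval principle in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual metric, so everything below is an assumption made for illustration: the function names, the group labels, the boolean "approved" rating, and the choice to summarize a response by its lowest per-group approval rate together with the gap between the highest and lowest rates.

```python
from collections import defaultdict


def approval_by_group(ratings):
    """Return the approval rate per group.

    `ratings` is an iterable of (group, approved) pairs, where `approved`
    is a bool. This input shape is hypothetical, not the PARETO schema.
    """
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, approved in ratings:
        totals[group] += 1
        approvals[group] += int(approved)
    return {group: approvals[group] / totals[group] for group in totals}


def neutrality_summary(ratings):
    """Summarize one response: high minimum approval and a small gap
    between groups both point toward balanced approval (assumed proxy)."""
    rates = approval_by_group(ratings)
    lowest = min(rates.values())
    gap = max(rates.values()) - lowest
    return {"rates": rates, "min_approval": lowest, "approval_gap": gap}


# Toy ratings for two responses to the same prompt (made-up data).
balanced = [("support", True), ("support", True), ("support", False), ("support", True),
            ("oppose", True), ("oppose", False), ("oppose", True), ("oppose", True)]
lopsided = [("support", True), ("support", True), ("support", True), ("support", True),
            ("oppose", False), ("oppose", False), ("oppose", True), ("oppose", False)]

balanced_summary = neutrality_summary(balanced)
lopsided_summary = neutrality_summary(lopsided)
```

In this toy data the first response is approved by three of four raters in each group, while the second is approved by everyone on one side and one of four raters on the other, so the second has a lower minimum approval and a larger gap. Because the summary is computed over whatever group labels appear in the ratings, it does not assume a single left-right axis.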


