arxiv:2309.13356

Exploring Large Language Models' Cognitive Moral Development through Defining Issues Test

Published on Sep 23, 2023
· Featured in Daily Papers on Sep 26, 2023

Abstract

The development of large language models has sparked widespread interest among researchers in understanding their inherent reasoning and problem-solving capabilities. Despite a good amount of research aimed at elucidating these capabilities, there is still an appreciable gap in understanding the moral development and judgments of these models. Current approaches that evaluate the ethical reasoning abilities of these models as a classification task introduce numerous inaccuracies because of over-simplification. In this study, we build a psychological connection by bridging two disparate fields: human psychology and AI. We propose an evaluation framework that can help delineate a model's ethical reasoning ability in terms of moral consistency and Kohlberg's stages of moral development, using the psychometric assessment tool known as the Defining Issues Test.

Community

Ok, this is a fun one because it touches on a big question, which is "can AI be moral?" My highlights:

Researchers from Microsoft have proposed using a psychological assessment tool called the Defining Issues Test (DIT) to evaluate the moral reasoning capabilities of large language models (LLMs) like GPT-3, ChatGPT, etc.

The DIT presents moral dilemmas and has subjects rate and rank the importance of various ethical considerations related to the dilemma. It allows quantifying the sophistication of moral thinking through a P-score.
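For context, the P-score is essentially the share of ranking weight a subject gives to "postconventional" (principled) considerations. Here's a minimal sketch of how that might be computed for a single dilemma; the stage labels, statement ids, and example ranking below are made up for illustration, and the official DIT aggregates over several dilemmas with its own scoring key:

```python
# Hedged sketch: a DIT-style P-score for one dilemma, assuming the standard
# weighting where the four statements ranked most important get weights 4, 3, 2, 1.
POSTCONVENTIONAL_STAGES = {"5A", "5B", "6"}  # principled-reasoning items

def p_score(ranked_statement_ids, stage_key):
    """ranked_statement_ids: the 4 statements ranked most important, in order;
    stage_key: dict mapping statement id -> Kohlberg stage label (illustrative)."""
    weights = [4, 3, 2, 1]                      # importance weights for ranks 1-4
    raw = sum(w for sid, w in zip(ranked_statement_ids, weights)
              if stage_key[sid] in POSTCONVENTIONAL_STAGES)
    return 100.0 * raw / sum(weights)           # % of ranking weight that is postconventional

# Illustrative usage with made-up statement ids and stage assignments
stage_key = {1: "4", 2: "5A", 3: "3", 4: "6", 5: "2", 6: "5B"}
print(p_score([2, 4, 1, 3], stage_key))         # -> 70.0
```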

In this new paper, the researchers tested prominent LLMs with adapted DIT prompts containing AI-relevant moral scenarios.
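To make that concrete, here's a rough sketch of what an adapted DIT-style prompt could look like; the dilemma text, statements, and wording are placeholders, not the paper's actual prompts:

```python
# Hypothetical DIT-style prompt builder; everything below is illustrative.
def build_dit_prompt(dilemma: str, statements: list[str]) -> str:
    numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(statements))
    return (
        f"Read the following moral dilemma:\n{dilemma}\n\n"
        "Rate how important each consideration below is to deciding what should be done "
        "(1 = no importance, 5 = great importance), then list the four most important "
        "considerations in order:\n"
        f"{numbered}\n"
    )

prompt = build_dit_prompt(
    "An engineer discovers a flaw in a deployed AI system that could leak user data...",
    [
        "Whether disclosing the flaw would damage the company's reputation",
        "Whether users have a right to know about risks to their data",
        "What the law requires in this situation",
    ],
)
print(prompt)
```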

Key findings:

  • Large models like GPT-3 failed to comprehend the prompts and scored near the random baseline in moral reasoning.
  • ChatGPT, Text-davinci-003, and GPT-4 showed coherent moral reasoning with above-random P-scores.
  • Surprisingly, the smaller 70B LlamaChat model outscored larger models on its P-score, demonstrating that advanced ethical understanding is possible without massive parameter counts.
  • The models operated mostly at the intermediate, conventional levels of Kohlberg's moral development theory. No model exhibited highly mature moral reasoning.

I think this is an interesting framework to evaluate and improve LLMs' moral intelligence before deploying them into sensitive real-world environments - to the extent that a model can be said to possess moral intelligence (or, seem to possess it?).

Here's a link to my full summary with a lot more background on Kohlberg's model (had to read up on it since I didn't study psych).

I'm curious ... did GPT write this paper (or at least the intro)? There is so much verbatim text repetition in the paper, almost to a comical degree. This is directly from the paper:

"For example, a seemingly controversial statement by the Dalai Lama, "suck my tongue," is a Tibetan expression but stirred global debate"

"For example, a statement made by the Dalai Lama, which may appear unconventional to some, is actually a Tibetan expression that generated global attention"

" Similarly, the swastika, associated with hatred in the West, symbolizes prosperity in Hinduism. "

"Likewise, the swastika, commonly associated with negativity in the West, holds a symbol of prosperity in Hinduism."

"The concept of "value pluralism" underscores the absence of a universal set of values applicable in all situations, further complicated by extreme variations in people’s preferences" (repeated verbatim X2)

"To address these challenges and risks, several works have proposed ethical frameworks, principles, guidelines, methods, and tools for designing, developing, evaluating, and deploying LLMs in a responsible and ethical manner." (repeated verbatim X2)

Also from the paper: "Prisoner dilemma - Should a man who escaped from prison but has since been leading an exemplary life be reported to authorities?" This is not the prisoner's dilemma.

The limitation you found with GPT-3.5 and davinci-002 is an interesting one: given a list of options, these models "chose" the ones at particular indices regardless of how the list was permuted.

For GPT-4 and the other models, did the answer to question 3 correspond to the same actual statement regardless of permutation? Why did you limit the evaluation to "8 specific orders" of the statements rather than fully shuffling at random? How did you pick the orders to use?
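For what it's worth, here's the kind of fully shuffled check I have in mind; `query_model` is just a stand-in for whatever API call returns the model's chosen index, so this is a sketch rather than the paper's actual setup:

```python
import random

def positional_bias_check(dilemma, statements, query_model, n_trials=20, seed=0):
    """Shuffle the options each trial and record what the model picks,
    both by index and by underlying statement content."""
    rng = random.Random(seed)
    chosen_indices, chosen_contents = [], []
    for _ in range(n_trials):
        order = list(range(len(statements)))
        rng.shuffle(order)
        shuffled = [statements[i] for i in order]
        idx = query_model(dilemma, shuffled)    # stand-in: returns the index the model picks
        chosen_indices.append(idx)
        chosen_contents.append(statements[order[idx]])
    # Nearly constant chosen_indices with varying chosen_contents suggests
    # the model is answering by position rather than by content.
    return chosen_indices, chosen_contents
```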
