Failed to reproduce results with CoT on MMLU

#3
by ShadowYing - opened

Problem Statement

I could not reproduce the reported chain-of-thought (CoT) results on MMLU, even when using the prompts provided by the paper (available at https://github.com/jasonwei20/flan-2).

On more than 10 randomly chosen tasks, my accuracies were lower than the reported ones. For example, on the philosophy task the reported accuracy is 55.9%, while I got 50.0%.

Experiment Setup

Example Input from the Philosophy Task

The following are multiple choice questions (with answers) about philosophy.

Q: The study of reality in the broadest sense, an inquiry into the elemental nature of the universe and the things in it, is known as _____.
(A) metaphysics (B) epistemology (C) quantum physics (D) axiology
A: Let's think step by step. We refer to Wikipedia articles on philosophy for help. Among the options, only metaphysics studies the nature of reality and existence. The answer is (A).

Q: According to Moore’s “ideal utilitarianism,” the right action is the one that brings about the greatest amount of:
(A) pleasure. (B) happiness. (C) good. (D) virtue.
A: Let's think step by step. We refer to Wikipedia articles on philosophy for help. Moore's "ideal utilitarianism" states that one's actions should maximize intrinsic goods. The answer is (C).

Q: Before Tolstoy's Christian conversion, what was his perspective on the meaning of life?
(A) optimist (B) satisfied (C) nominally religious (D) pessimist
A: Let's think step by step. We refer to Wikipedia articles on philosophy for help. Before his conversion, Tolstoy feels that life was uncertain, which is a pessimist's point of view. The answer is (D).

Q: According to d'Holbach, people always act according to _____.
(A) free choices (B) dictates of the soul (C) necessary natural laws (D) undetermined will
A: Let's think step by step. We refer to Wikipedia articles on philosophy for help. d'Holbach believes that people act according to necessary laws, and it proves nothing about people's free will. The answer is (C).

Q: Psychological egoism is:
(A) an ethical theory about how we ought to behave. (B) a generalization concerning the way people tend to behave. (C) a claim about human nature and the ways people are capable of behaving. (D) none of the above.
A: Let's think step by step. We refer to Wikipedia articles on philosophy for help. Psychological egoism suggests that one behaves based on what makes one feels good, hence it is a claim about human nature and how humans are capable of behaving. The answer is (C).

Q: Brandt claims that act-utilitarianism:
(A) has implausible consequences. (B) gives rise to moral dilemmas. (C) is self-contradictory. (D) all of the above.
A:
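
For reference, scoring a prompt like the one above can be sketched as follows. This is a minimal sketch, not the paper's evaluation code: it assumes the model closes its reasoning with "The answer is (X)." as in the few-shot exemplars, and the helper names `extract_answer` and `accuracy` are hypothetical. Generations that do not match the pattern are counted as wrong, which is one place where two evaluation setups can diverge.

```python
import re
from typing import List, Optional

# Assumption: answers follow the exemplar format "The answer is (X)."
ANSWER_PATTERN = re.compile(r"The answer is \(([A-D])\)")


def extract_answer(generation: str) -> Optional[str]:
    """Return the predicted letter, or None if no answer can be parsed."""
    # Keep only the text before the next "Q:", in case the model
    # continues by writing a new question after answering this one.
    first_segment = generation.split("\nQ:")[0]
    match = ANSWER_PATTERN.search(first_segment)
    return match.group(1) if match else None


def accuracy(generations: List[str], gold_labels: List[str]) -> float:
    """Fraction of generations whose parsed answer equals the gold label."""
    if len(generations) != len(gold_labels):
        raise ValueError("generations and gold_labels must have equal length")
    if not gold_labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        extract_answer(gen) == gold
        for gen, gold in zip(generations, gold_labels)
    )
    return correct / len(gold_labels)
```

A looser pattern (for example, one that accepts an answer letter without parentheses) would parse more generations but can also produce false matches, so the choice of pattern may shift the measured accuracy.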

In fact, I also failed to reproduce results using direct prompting.
