## Original survey question

Philosophers were given the following description of a study taken from the paper "Causal Judgment and Moral Judgment: Two Experiments":

Suppose subjects are presented with the following case:

The receptionist in the philosophy department keeps her desk stocked with pens. The administrative assistants are allowed to take the pens, but faculty members are supposed to buy their own.

The administrative assistants typically do take the pens. Unfortunately, so do the faculty members. The receptionist has repeatedly emailed them reminders that only administrative assistants are allowed to take the pens.

On Monday morning, one of the administrative assistants encounters Professor Smith walking past the receptionist's desk. Both take pens. Later that day, the receptionist needs to take an important message... but she has a problem. There are no pens left on her desk.

All subjects are then asked how much they agree with each of the following statements:

1. Professor Smith caused the problem

2. The administrative assistant caused the problem

Subjects responded to each question by selecting a number on a scale ranging from -3 (no agreement) to 3 (full agreement).

The philosophers in our study were then asked to respond to the following question:

On average, subjects' agreement with 1 would be

• significantly greater than their agreement with 2.
• not significantly different from their agreement with 2.
• significantly lower than their agreement with 2.
• Unable to provide an unbiased answer.

(Note: in this and subsequent questions, respondents were told to interpret questions about significance as being about statistical significance, and to let p < .05.)

## Results

(Note: all counts and percentages below cover only the respondents who did not select the 'Unable to provide an unbiased answer' response.)

In the first part of the study ("study 1"), respondents were given no further information about the subjects of the original study. A total of 190 philosophers responded in study 1 (while 32 indicated that they could not provide unbiased responses). The 190 responses were distributed as follows (here and below, the answer which predicts the results of the original study is in bold):

| Response | Total responses from study 1 | % responses from study 1 |
| --- | --- | --- |
| **significantly greater than their agreement with 2.** | 182 | 95.8% |
| not significantly different from their agreement with 2. | 8 | 4.2% |
| significantly lower than their agreement with 2. | 0 | 0% |

In the second part of the study ("study 2"), respondents were given the above question and told to suppose that the original study involved 20 American undergraduate university students. A total of 97 philosophers responded in study 2 (while 10 indicated that they could not provide unbiased responses). The 97 responses were distributed as follows:

| Response | Total responses from study 2 | % responses from study 2 |
| --- | --- | --- |
| **significantly greater than their agreement with 2.** | 86 | 88.7% |
| not significantly different from their agreement with 2. | 7 | 7.2% |
| significantly lower than their agreement with 2. | 4 | 4.1% |

In all, a total of 287 philosophers answered this question:

| Response | Total responses from combined study | % responses from combined study |
| --- | --- | --- |
| **significantly greater than their agreement with 2.** | 268 | 93.4% |
| not significantly different from their agreement with 2. | 15 | 5.2% |
| significantly lower than their agreement with 2. | 4 | 1.4% |
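The combined figures can be checked against the per-study counts reported above. The following is a minimal Python sketch of that arithmetic. The names `study1`, `study2`, `combined` and `percentages` are illustrative only and are not taken from the survey.

```python
# Response counts as reported in the results above, in the order:
# (significantly greater, not significantly different, significantly lower).
study1 = (182, 8, 0)
study2 = (86, 7, 4)


def percentages(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


# The combined counts are the element-wise sums of the two studies.
combined = tuple(a + b for a, b in zip(study1, study2))

for label, counts in [("study 1", study1), ("study 2", study2), ("combined", combined)]:
    print(label, "n =", sum(counts), counts, percentages(counts))
```

The totals come to 190, 97 and 287 respondents, and the rounded percentages match those reported for each study and for the combined sample.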

## Comparison with original results

Knobe and Fraser describe the results of the original study:

All subjects were given the story of the professor and the administrative assistant. They were then asked to indicate whether they agreed or disagreed with the following two statements:

'Professor Smith caused the problem.'
'The administrative assistant caused the problem.'

The results showed a dramatic difference. People agreed with the statement that Professor Smith caused the problem but disagreed with the statement that the administrative assistant caused the problem.

In a footnote, they continue:

Subjects were 18 students in an introductory philosophy class at University of North Carolina-Chapel Hill. [...] Each subject rated both statements on a scale from -3 ('not at all') to +3 ('fully'), with the 0 point marked 'somewhat'. The mean rating for the statement that the professor caused the problem was 2.2; the mean for the statement that the assistant caused the problem was -1.2. This difference is statistically significant.