All Categorical - Simpson’s Paradox

(or better Yule-Simpson’s Paradox)

Case Study: Sex Discrimination in Graduate School Admissions

This case study uses the famous Berkeley data on sex discrimination. In the fall quarter of 1973, 8,442 men and 4,321 women applied for admission to graduate school.

Source: Freedman, D., Pisani, R., Purves, R. and Adhikari, A. (1991) Statistics (2nd edition). W. W. Norton.

First we will look at the overall admittance numbers:

attach(berkeleyadmissions)
berkeleyadmissions[1:2, 1:3]

##   Overall  Sex Admitted
## 1    Men: 8442     3738
## 2  Women: 4321     1494

Let’s find the percentages:

round(c(3738/8442, 1494/4321)*100, 1)

## [1] 44.3 34.6

which shows a sizable difference in admission rates. We can also do the test:

chi.ind.test(berkeleyadmissions[1:2, 2:3])

## p value of test p=0.000

Parameters of interest: measure of association
Method of analysis: chi-square test of independence
Assumptions of Method: all expected counts greater than 5
Type I error probability \(\alpha\)=0.05
H₀: Classifications are independent = there is no difference in the admissions rates of men and women.
Hₐ: Classifications are dependent = there is some difference in the admissions rates of men and women.
p=0.000 (rounded to three decimals)
Since 0.000<0.05, we reject the null hypothesis: there is some difference in the admissions rates of men and women.
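
For readers without the course function chi.ind.test, the same kind of test can be sketched in base R with chisq.test(). This is only a sketch under stated assumptions, not the course routine: the counts are typed in from the table above, the rejected counts are computed as applicants minus admitted, and the object names (overall, res) are made up for illustration.

```r
# Overall counts from the table above; "Rejected" = applicants - admitted
overall <- matrix(c(3738, 8442 - 3738,
                    1494, 4321 - 1494),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(Sex = c("Men", "Women"),
                                  Decision = c("Admitted", "Rejected")))

# Chi-square test of independence (base R; for a 2x2 table the default
# applies a continuity correction)
res <- chisq.test(overall)

res$expected   # check the assumption: all expected counts should exceed 5
res$p.value    # compare with alpha = 0.05
```

The p value from chisq.test() need not match the course output digit for digit, but it leads to the same decision at the 5% level.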

Now let’s consider the data broken down by major:

berM <- berkeleyadmissions[ ,5:6] 
berM

##   Men.Applied Men.Admitted
## 1         825          512
## 2         560          353
## 3         325          120
## 4         417          138
## 5         191           53
## 6         373           22

round(berM[ ,2]/berM[ ,1]*100, 2)

## [1] 62.06 63.04 36.92 33.09 27.75  5.90

berF <- berkeleyadmissions[ ,7:8] 
berF

##   Women.Applied Women.Admitted
## 1           108             89
## 2            25             17
## 3           593            202
## 4           375            131
## 5           393             94
## 6           341             24

round(berF[,2]/berF[,1]*100, 2)

## [1] 82.41 68.00 34.06 34.93 23.92  7.04

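and suddenly any hint of discrimination against women is gone: within each major the admission rates of men and women are similar, and in four of the six majors the women’s rate is actually the higher one.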

A formal hypothesis test for this is possible but outside the scope of this course.
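
The per-major rates computed above are easier to compare side by side. The following is a minimal sketch that types the counts in from the output above instead of reading them from berkeleyadmissions; it assumes that rows 1 to 6 correspond to Majors A to F, and the object names are chosen only for this illustration.

```r
# Counts for the six majors, typed in from the output above
# (assumption: rows 1-6 are Majors A-F)
major <- c("A", "B", "C", "D", "E", "F")
men.applied    <- c(825, 560, 325, 417, 191, 373)
men.admitted   <- c(512, 353, 120, 138,  53,  22)
women.applied  <- c(108,  25, 593, 375, 393, 341)
women.admitted <- c( 89,  17, 202, 131,  94,  24)

# Admission rates in percent, by major and sex
rates <- data.frame(
  Major = major,
  Men   = round(men.admitted / men.applied * 100, 2),
  Women = round(women.admitted / women.applied * 100, 2)
)

# Positive difference = women admitted at a higher rate in that major
rates$Diff <- rates$Women - rates$Men
rates
```

The Diff column is positive for Majors A, B, D and F and negative for C and E, so no major shows the large gap in favor of men that the overall numbers suggest.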

So, we have a paradox:

We found strong evidence (p value=0.000) of a relationship between the gender of an applicant and whether or not they were admitted to the School.
When we broke the data down further by the major of the applicant, this relationship went away.

How is this possible?

Actually, we already know the answer: this is again a case of mistaking the effect of a latent variable for a cause-effect relationship.

There is clearly a relationship between acceptance and gender. But saying it is due to sex discrimination is saying we have a cause-effect relationship. Instead we now know it is because of the latent variable Major.

Can we understand this in the Berkeley Admissions case?

Majors A and B are very popular with the men - 1385 men applied vs. 133 women. Majors A and B are also easy to get into - roughly 62% of the men and 80% of the women who applied were accepted. So although the women’s acceptance rate is at least as high as the men’s, about 8 times as many men are accepted (865 vs. 106) because about 10 times as many applied.

Majors C-F are more popular with the women - 1306 men applied vs. 1702 women. But Majors C-F are hard to get into - only about 1 in 4 of the applicants (men or women) is accepted. Because most of the women applied to these majors, their overall acceptance rate is pulled down, even though within each major they do about as well as the men.
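
The arithmetic in the two paragraphs above can be checked with a few lines of R. This is a sketch under the same assumptions as before: the counts are typed in from the tables, rows 1 to 6 are taken to be Majors A to F, and the helper group.summary and the other object names are hypothetical.

```r
men.applied    <- c(A = 825, B = 560, C = 325, D = 417, E = 191, F = 373)
men.admitted   <- c(A = 512, B = 353, C = 120, D = 138, E =  53, F =  22)
women.applied  <- c(A = 108, B =  25, C = 593, D = 375, E = 393, F = 341)
women.admitted <- c(A =  89, B =  17, C = 202, D = 131, E =  94, F =  24)

easy <- c("A", "B")            # majors with high admission rates
hard <- c("C", "D", "E", "F")  # majors with low admission rates

# Hypothetical helper: number of applicants and admission rate (in %)
# for a set of majors
group.summary <- function(applied, admitted, majors) {
  c(Applied = sum(applied[majors]),
    Rate    = round(sum(admitted[majors]) / sum(applied[majors]) * 100, 1))
}

by.group <- rbind(
  Men.AB   = group.summary(men.applied, men.admitted, easy),
  Women.AB = group.summary(women.applied, women.admitted, easy),
  Men.CF   = group.summary(men.applied, men.admitted, hard),
  Women.CF = group.summary(women.applied, women.admitted, hard)
)
by.group

# Pooled rates over the six majors: a weighted average of the group rates,
# with weights determined by where each sex applied
pooled <- round(c(Men   = sum(men.admitted) / sum(men.applied),
                  Women = sum(women.admitted) / sum(women.applied)) * 100, 1)
pooled
```

About half of the men in these six majors applied to A or B, but only about 7% of the women did. Pooling the majors therefore weights the women’s rate heavily towards the hard majors, which is how the pooled rate for men can come out higher even though no single major shows a large gap in their favor.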

If in an observational study (as opposed to a clinical trial with random assignments to “treatment” and “control”) we find a relationship (association) between two variables, it is usually very hard (impossible?) to decide whether it is due to a cause-effect relationship or whether a latent variable is responsible for it. In the Berkeley case it turned out that Major was a latent variable. A list of other potential latent variables includes:

Prior educational achievements
Age
Financial situation of parents

and so on.

Note that we could determine here that Major is a latent variable explaining the relationship between Gender and Acceptance only because we had the data to do so! So generally in a study you want to “measure” as many variables as possible, because you won’t know ahead of time which of them might turn out to be important.