What Is Simpson's Paradox
Simpson's paradox is a phenomenon in which a trend that holds within subgroups weakens, disappears, or reverses when the subgroups are combined. Graduate admissions at the University of California, Berkeley, in fall 1973 provide the canonical example. Overall admission rates were about 44% for men and 35% for women, appearing to favor men. Yet when examined department by department, most departments showed no significant bias against women, and several admitted women at higher rates than men.
The reversal occurred because women disproportionately applied to highly competitive departments. With many women applying to departments with low overall acceptance rates, aggregating across departments pulled the pooled female admission rate down. The aggregate gap was driven not by biased admissions decisions but by skewed application patterns. Without appropriate stratification, one can "discover" discrimination that the stratified data do not support.
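The arithmetic behind such a reversal can be sketched with a small example. The counts below are hypothetical, chosen only to make the effect visible; they are not the actual Berkeley figures, and the department names and helper functions are illustrative assumptions.

```python
# Hypothetical (applicants, admitted) counts per department and group.
# Illustrative only: these are NOT the actual Berkeley figures.
applications = {
    "less_competitive": {"men": (800, 480), "women": (200, 130)},
    "more_competitive": {"men": (200, 20), "women": (800, 120)},
}


def department_rates(data):
    """Admission rate for each group within each department."""
    return {
        dept: {group: admitted / applicants
               for group, (applicants, admitted) in groups.items()}
        for dept, groups in data.items()
    }


def overall_rates(data):
    """Admission rate for each group after pooling all departments."""
    applicants_total, admitted_total = {}, {}
    for groups in data.values():
        for group, (applicants, admitted) in groups.items():
            applicants_total[group] = applicants_total.get(group, 0) + applicants
            admitted_total[group] = admitted_total.get(group, 0) + admitted
    return {g: admitted_total[g] / applicants_total[g] for g in applicants_total}


by_department = department_rates(applications)
pooled = overall_rates(applications)

for dept, rates in by_department.items():
    print(f"{dept}: men {rates['men']:.0%}, women {rates['women']:.0%}")
print(f"pooled: men {pooled['men']:.0%}, women {pooled['women']:.0%}")
```

In this made-up data, women are admitted at a higher rate in both departments (65% vs. 60%, and 15% vs. 10%), yet the pooled rate is 25% for women against 50% for men, because most women applied to the department that admits few applicants of either group.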
Simpson's Paradox in Ranking Data
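Similar reversals can occur in global rankings. For instance, Country A may have a higher life expectancy at birth than Country B, yet Country B may lead in remaining life expectancy at age 65: Country A might have extremely low youth mortality (excellent pediatric care) while Country B's elderly live longer. Strictly speaking, this is a ranking that differs across subgroups rather than a full reversal. A true Simpson-type reversal appears in crude death rates, where a country with a younger population can report fewer deaths per 1,000 people despite having higher mortality in every age group.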
In income rankings, the apparent contradiction "national median income rose" while "median income fell in every age group" is entirely possible. If the share of the population in a higher-income age group grows (for example, a larger proportion of relatively high-income elderly), the overall median can rise even as each age group's median declines. Compositional changes in the population distort aggregate figures.
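A minimal sketch of the income case, using invented incomes in arbitrary units and two hypothetical age groups. The only thing that changes substantially between the two years is the mix of the population.

```python
from statistics import median

# Hypothetical incomes (arbitrary units); group names and values are
# assumptions made purely for illustration.
year_1 = {
    "younger": [28, 30, 32] * 20,  # 60 people, median 30
    "older": [58, 60, 62] * 5,     # 15 people, median 60
}
year_2 = {
    "younger": [26, 28, 30] * 5,   # 15 people, median 28
    "older": [56, 58, 60] * 20,    # 60 people, median 58
}


def group_medians(groups):
    """Median income within each age group."""
    return {name: median(incomes) for name, incomes in groups.items()}


def overall_median(groups):
    """Median income after pooling every age group together."""
    pooled = [income for incomes in groups.values() for income in incomes]
    return median(pooled)


print("group medians, year 1:", group_medians(year_1))
print("group medians, year 2:", group_medians(year_2))
print("overall median:", overall_median(year_1), "->", overall_median(year_2))
```

Each group's median falls by 2 units, while the overall median rises from 30 to 58, because the higher-income group went from a fifth of the population to four fifths of it.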
Confounding Variables - The Hidden Third Factor
The essence of Simpson's paradox is the presence of a "confounding variable" - a third variable that influences both the presumed cause and the presumed effect. Ignoring confounders produces spurious associations or conceals genuine ones.
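In the Berkeley example, "department applied to" was the hidden third variable. Gender influenced department choice, and department influenced admission rates. (Because gender affects department choice rather than the reverse, department is strictly a mediator rather than a textbook confounder, but for the question of whether admissions decisions themselves were biased it must be accounted for in the same way.) Examining only gender and admission rates while ignoring this variable leads to erroneous conclusions. When interpreting ranking data, one should always ask: "Is there a third variable that could explain this apparent relationship?"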
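One common way to account for such a third variable is direct standardization: compute each group's rate within every stratum, then average those rates using the same stratum weights for both groups. The sketch below uses hypothetical counts and hypothetical helper names, and it assumes every group has applicants in every department. Whether adjustment is appropriate at all still depends on the causal question being asked.

```python
# Hypothetical rows: (department, group, applicants, admitted).
# Illustrative numbers only, not real admissions data.
rows = [
    ("X", "men", 600, 360),
    ("X", "women", 100, 70),
    ("Y", "men", 400, 80),
    ("Y", "women", 900, 225),
]


def crude_rate(rows, group):
    """Pooled admission rate for one group, ignoring department."""
    applicants = sum(n for _, g, n, _ in rows if g == group)
    admitted = sum(k for _, g, _, k in rows if g == group)
    return admitted / applicants


def standardized_rate(rows, group):
    """Weight each department's rate for `group` by that department's
    share of ALL applicants, so both groups are compared on the same mix."""
    dept_totals = {}
    for dept, _, n, _ in rows:
        dept_totals[dept] = dept_totals.get(dept, 0) + n
    grand_total = sum(dept_totals.values())

    rate = 0.0
    for dept, g, n, k in rows:
        if g == group:
            rate += (dept_totals[dept] / grand_total) * (k / n)
    return rate


for group in ("men", "women"):
    print(f"{group}: crude {crude_rate(rows, group):.3f}, "
          f"standardized {standardized_rate(rows, group):.3f}")
```

With these invented counts the crude rates favor men (0.440 vs. 0.295), while the standardized rates favor women (0.340 vs. 0.4075 before rounding), which matches the department-level picture.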
How the Choice of Aggregation Level Changes Conclusions
The fundamental lesson of Simpson's paradox is that conclusions depend on the level of aggregation. Whether to examine the whole or the subgroups, and which answer is "correct," depends on the analytical purpose and the underlying causal structure.
To investigate discrimination, department-level (stratified) analysis is appropriate. To consider university-wide resource allocation, aggregate figures may be appropriate. There is no uniquely "correct" aggregation level - it depends on how the question is framed. When viewing ranking numbers, cultivating the habit of asking "at what level was this aggregated?" and "would the conclusion change at a different level?" is essential.
Practical Approaches to Detecting the Paradox
Several practical checks help detect Simpson's paradox. First, whenever you observe an overall trend, verify it within subgroups. Decompose by sex, age group, region, income bracket, and other major stratification variables to check whether the trend is consistent.
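The subgroup check described above can be automated in a simple form. The following sketch compares two groups, A and B, on a success rate. The function name `shows_reversal`, the data layout, and the example counts are all assumptions for illustration. It flags only the strict case in which every stratum points one way and the pooled data point the other.

```python
def sign(x):
    """Return 1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def shows_reversal(strata):
    """Return True if A and B differ in the same direction in every
    stratum but in the opposite direction once the strata are pooled.

    `strata` maps a stratum name to ((n_a, successes_a), (n_b, successes_b)).
    """
    stratum_signs = set()
    total_n_a = total_k_a = total_n_b = total_k_b = 0
    for (n_a, k_a), (n_b, k_b) in strata.values():
        stratum_signs.add(sign(k_a / n_a - k_b / n_b))
        total_n_a += n_a
        total_k_a += k_a
        total_n_b += n_b
        total_k_b += k_b

    if len(stratum_signs) != 1:
        return False  # strata disagree with each other: no clean reversal
    direction = stratum_signs.pop()
    pooled_sign = sign(total_k_a / total_n_a - total_k_b / total_n_b)
    return direction != 0 and pooled_sign == -direction


# Hypothetical counts: B does better in each region, A does better pooled.
reversed_case = {
    "urban": ((400, 360), (100, 95)),
    "rural": ((100, 50), (400, 240)),
}
# Same rates, but A and B are spread evenly across regions.
consistent_case = {
    "urban": ((250, 225), (250, 238)),
    "rural": ((250, 125), (250, 150)),
}

print(shows_reversal(reversed_case))
print(shows_reversal(consistent_case))
```

A check like this only covers the stratification variables it is given. A reversal along a variable that was never recorded will go unnoticed, so it complements rather than replaces thinking about which variables matter.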
Second, check whether group composition has shifted over time, to distinguish whether "the average improved" because individuals improved or because composition changed. Third, draw causal diagrams (directed acyclic graphs, or DAGs) to visualize the confounding structure. Determining which variables to condition on requires understanding the causal architecture - it cannot be decided from the data alone.
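For the second check, a change in an overall mean can be split exactly into a within-group part (group means changed) and a composition part (group shares changed). The sketch below uses one standard symmetric decomposition, which weights each change by the average of the two periods. The shares, means, and function name are hypothetical, and the identity applies to means rather than medians.

```python
def decompose_change(period_1, period_2):
    """Split the change in an overall mean into two parts.

    Each period maps a group name to (population_share, group_mean).
    Returns (within_effect, composition_effect); the two sum to the
    total change in the share-weighted overall mean.
    """
    within = 0.0
    composition = 0.0
    for group in period_1:
        share_1, mean_1 = period_1[group]
        share_2, mean_2 = period_2[group]
        # Change in the group's mean, weighted by its average share.
        within += (share_1 + share_2) / 2 * (mean_2 - mean_1)
        # Change in the group's share, weighted by its average mean.
        composition += (mean_1 + mean_2) / 2 * (share_2 - share_1)
    return within, composition


def overall_mean(period):
    """Share-weighted mean across all groups."""
    return sum(share * mean for share, mean in period.values())


# Hypothetical shares and mean incomes (arbitrary units).
before = {"younger": (0.8, 30.0), "older": (0.2, 60.0)}
after = {"younger": (0.2, 28.0), "older": (0.8, 58.0)}

within, composition = decompose_change(before, after)
total = overall_mean(after) - overall_mean(before)
print(f"total change: {total:+.1f}")
print(f"within-group: {within:+.1f}, composition: {composition:+.1f}")
```

In this invented example the overall mean rises by 16 units even though the within-group effect is -2. The entire increase, and more, comes from the +18 composition effect, which is the signature of an aggregate that "improved" only because the population mix changed.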