Definition and the Berkeley Admissions Example
Simpson's Paradox is a statistical phenomenon in which a trend observed within individual subgroups reverses when the groups are combined. The classic example comes from 1973 UC Berkeley admissions data, where women appeared to have a lower overall acceptance rate than men, even though they had equal or higher rates within each department. The reversal occurred because women disproportionately applied to more competitive departments with lower acceptance rates across the board.
The Role of Confounding Variables
At the heart of the paradox lies the confounding variable - a third factor that influences both the apparent cause and the outcome. In the Berkeley case, the choice of department was the confounder. Ignoring it produced a misleading aggregate picture.
Techniques such as stratified analysis, regression adjustment, and propensity score matching help control for confounders. The paradox serves as a clear reminder that naive aggregation can lead to entirely wrong conclusions about causal relationships.
Occurrence in Ranking Data
Simpson's Paradox can easily arise in global rankings. For instance, a country might outperform the world average in every age group for a health metric, yet fall below the world average overall because its population skews toward an age group with inherently lower scores. Whenever you compare groups with different compositions, this paradox is a real risk.
Why Aggregation Level Matters
The level at which data is aggregated can flip conclusions entirely. Whether you look at a national ranking, a regional breakdown, or an age-stratified view, your apparent position may change dramatically. Relying on a single aggregation without checking subgroups is a common source of misinterpretation. Examining data from multiple angles is the first step toward an accurate understanding of where you truly stand.