What Is Survivorship Bias
During World War II, the U.S. military examined bullet holes on returning bombers and planned to reinforce the most heavily damaged areas. Statistician Abraham Wald objected: the areas without bullet holes on returning aircraft were precisely where a hit proved fatal and kept the plane from coming back, so those were the areas that needed armor. This is the classic illustration of survivorship bias.
Survivorship bias is a form of selection bias that arises when only successful cases remain observable while failures vanish from the dataset. In ranking data, this bias lurks everywhere, silently shaping the conclusions we draw from seemingly objective numbers.
Survivorship Bias Hidden in Rankings
Many aspire to entrepreneurship after studying billionaire rankings. Yet these rankings exclude the millions who started businesses in the same era and failed. Concluding that entrepreneurship leads to wealth by observing only survivors is the same error as armoring bombers based solely on returning aircraft.
Income rankings suffer similarly. Data from high-income countries is relatively accurate, but conflict zones and regions of extreme poverty often lack data collection infrastructure entirely. Rankings built from "those who could be measured" systematically exclude the most disadvantaged populations.
Survivorship Bias in Health Data
Articles about "habits shared by centenarians" are perennially popular but represent textbook survivorship bias. People who practiced the same habits but died at 70 never enter the study. Traits common among survivors are not necessarily causes of longevity.
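To make this concrete, here is a minimal Python sketch with made-up numbers. The cohort size, the share of people with the habit, and the odds of reaching 100 are all assumptions chosen for illustration, not study data. By construction the habit has no effect on survival, yet it still shows up as something most centenarians share.

```python
# Hypothetical cohort; every number here is an illustrative assumption.
cohort = 100_000            # people born in the same year
with_habit = 60_000         # e.g. people who drink tea every day
without_habit = cohort - with_habit
reach_100_per_100 = 1       # 1 in 100 reaches age 100, habit or not

# Survivors in each group (the habit has no effect by construction).
centenarians_with = with_habit * reach_100_per_100 // 100
centenarians_without = without_habit * reach_100_per_100 // 100

# What an article that samples only centenarians would report.
share_among_centenarians = centenarians_with / (
    centenarians_with + centenarians_without
)

# What the full cohort shows.
survival_rate_with = centenarians_with / with_habit
survival_rate_without = centenarians_without / without_habit

print(f"Centenarians with the habit: {share_among_centenarians:.0%}")
print(f"Survival rate with habit:    {survival_rate_with:.1%}")
print(f"Survival rate without habit: {survival_rate_without:.1%}")
```

In this toy cohort, 60% of centenarians have the habit only because 60% of everyone had it. The survival rate is 1.0% in both groups, which is the comparison a survivors-only article cannot make.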
The relationship between BMI and mortality is also affected. Studies of elderly populations sometimes find that slightly overweight individuals live longer, but one explanation is that the thin group includes people who lost weight because of illness. Such findings do not show that excess weight is healthy; they may reflect reverse causation contaminating the sample.
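The mixing effect described above can be sketched with a small worked example. The group sizes and death rates below are hypothetical values picked to show the mechanism, not figures from any study, and the `mortality` helper is introduced here purely for illustration.

```python
# Hypothetical elderly sample; counts and death rates are illustrative assumptions.
groups = {
    # name: (people, deaths per 100 over the follow-up period)
    "thin, always thin": (800, 5),
    "thin, lost weight to illness": (200, 40),
    "slightly overweight": (1000, 8),
}

def mortality(names):
    """Pooled death rate across the named groups."""
    people = sum(groups[n][0] for n in names)
    deaths = sum(groups[n][0] * groups[n][1] // 100 for n in names)
    return deaths / people

# A study that only records current weight pools both thin groups.
thin_pooled = mortality(["thin, always thin", "thin, lost weight to illness"])
thin_healthy = mortality(["thin, always thin"])
overweight = mortality(["slightly overweight"])

print(f"Thin (pooled):        {thin_pooled:.1%}")
print(f"Thin (never ill):     {thin_healthy:.1%}")
print(f"Slightly overweight:  {overweight:.1%}")
```

With these assumed numbers the pooled thin group looks worse than the slightly overweight group (12.0% against 8.0%), even though people who were always thin do better than both (5.0%). The illness caused the low weight, not the other way around.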
Using Rankings While Acknowledging Bias
Completely eliminating survivorship bias is impossible. However, merely recognizing its existence fundamentally changes how we interpret data. The essential questions when examining any ranking are: "Who is not included here?" and "Why are they absent?"
MyRank's rankings are no exception. Anyone using this tool has internet access, digital literacy, and leisure time, which already limits the sample to a specific stratum of the world population. Those on the far side of the digital divide are absent from the ranking's reference population. Maintaining this awareness is the core of data literacy.
Thinking Strategies to Counter Bias
Practical thinking strategies can counteract survivorship bias. First, imagine the counterfactual: when observing a success story, estimate how many others failed under similar conditions. Second, check the base rate: anchor on the success rate among everyone who made the attempt, not on how common a particular attribute is among those who succeeded.
Third, examine the selection process: consider why specific data reached you and what filtering it underwent. Success stories appearing in social media feeds are algorithmically selected for high engagement, meaning extreme and unrepresentative cases are systematically amplified.
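The first two strategies can be combined into a rough back-of-the-envelope calculation. The sketch below is only an illustration: the function name `unseen_failures`, the five success stories, and the 1-in-200 base rate are all hypothetical, and it assumes every success is visible, which a filtered feed does not guarantee.

```python
def unseen_failures(visible_successes: int, attempts_per_success: int) -> int:
    """Estimate the failures implied by a base rate of 1 success per N attempts."""
    implied_attempts = visible_successes * attempts_per_success
    return implied_attempts - visible_successes

# Hypothetical: 5 success stories in a feed, assumed base rate of 1 in 200.
stories = 5
failures = unseen_failures(stories, attempts_per_success=200)
print(f"{stories} visible successes imply roughly {failures} unseen failures")
```

Under these assumptions, five visible stories stand in for about 995 attempts that never reached the feed. The exact figure matters less than the habit of asking for the denominator.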