How Regression to the Mean Was Discovered

In 1886, Francis Galton compared parents' heights with the heights of their adult children. He found that the children of tall parents tended to be shorter than their parents, and the children of short parents tended to be taller than theirs. In both cases the children were, on average, closer to the population average than their parents were: extreme values regress toward the average in the next generation. Galton called this "regression towards mediocrity," and this is the etymological origin of "regression analysis."

Regression to the mean is not a biological phenomenon but a purely statistical one. The more extreme a measured value, the more likely the next measurement will be closer to the average. This happens whenever two measurements of the same thing are imperfectly correlated, which is the case for any data containing measurement error or random variation.
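The statistical nature of the effect can be illustrated with a small simulation. The sketch below is only an illustration under assumed numbers: every simulated person has a fixed true value, each measurement adds independent noise, and the mean of 100 and the standard deviations of 10 are hypothetical choices, as are the function names.

```python
import random
import statistics

def simulate_two_measurements(n=10_000, true_sd=10.0, noise_sd=10.0, seed=0):
    """Measure the same fixed true value twice, with independent noise each time."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        true_value = rng.gauss(100.0, true_sd)  # does not change between measurements
        first.append(true_value + rng.gauss(0.0, noise_sd))
        second.append(true_value + rng.gauss(0.0, noise_sd))
    return first, second

def top_group_means(first, second, fraction=0.05):
    """Mean first and second measurement of those ranked highest on the first."""
    k = max(1, int(len(first) * fraction))
    top = sorted(range(len(first)), key=lambda i: first[i], reverse=True)[:k]
    return (statistics.fmean(first[i] for i in top),
            statistics.fmean(second[i] for i in top))

first, second = simulate_two_measurements()
overall = statistics.fmean(first)
top_first, top_second = top_group_means(first, second)
print(f"overall mean:                   {overall:.1f}")
print(f"top 5% on first measurement:    {top_first:.1f}")
print(f"same group, second measurement: {top_second:.1f}")
```

Nothing about the simulated people changes between the two measurements, yet the group selected for its extreme first result scores closer to the overall mean the second time, while still staying above it.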

How Regression to the Mean Affects Rankings

When MyRank displays an extremely high (or low) percentile, that value may partly reflect measurement error rather than your typical state. For example, weighing yourself before breakfast versus after dinner can produce a 1-2 kg difference, which is enough to shift your BMI ranking by several percentile points.
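As a rough worked example of the weighing case, the sketch below converts two weights into BMI and then into a percentile. The reference distribution (normal, mean 22, standard deviation 3), the 1.70 m height, and the function names are assumptions made for illustration; they are not MyRank's actual data or method, and real BMI distributions are skewed rather than normal.

```python
from statistics import NormalDist

# Hypothetical reference population, chosen only for illustration.
ASSUMED_BMI_DIST = NormalDist(mu=22.0, sigma=3.0)

def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

def bmi_percentile(weight_kg, height_m, dist=ASSUMED_BMI_DIST):
    """Percentage of the assumed population with a BMI at or below this one."""
    return 100.0 * dist.cdf(bmi(weight_kg, height_m))

height_m = 1.70
morning = bmi_percentile(65.0, height_m)  # weighed before breakfast
evening = bmi_percentile(67.0, height_m)  # same person, 2 kg heavier after dinner
print(f"morning: BMI {bmi(65.0, height_m):.2f}, percentile {morning:.1f}")
print(f"evening: BMI {bmi(67.0, height_m):.2f}, percentile {evening:.1f}")
print(f"shift:   {evening - morning:.1f} percentile points")
```

The size of the shift depends on where the BMI sits in the distribution: the same 2 kg moves the percentile most near the middle, where many people are packed closely together, and least in the tails.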

If your ranking reads "top 5%" on one day, it might read "top 8%" on another. Extreme results tend to "revert" toward the mean on subsequent measurements. This is not because your ability or condition changed, but because the random variation happened to fall differently the second time. Avoid over-interpreting any single measurement result.

The Causal Illusions Created by Regression to the Mean

Without understanding regression to the mean, people perceive causal relationships that do not exist. A classic example is the teacher's belief that "scolding improves performance and praise worsens it." After an extremely poor performance, scores tend to improve whether or not the student is scolded, and after an extremely good performance, scores tend to decline whether or not the student is praised. Both are regression to the mean. Without a comparison group that received no intervention, the intervention's effect cannot be separated from statistical regression.

Medicine faces the same problem. If treatment begins when symptoms are at their worst, symptoms are likely to improve even without treatment (regression to the mean). Measuring the true effect of treatment requires randomized controlled trials: the untreated control group regresses toward the mean just as much as the treated group, so the difference between the two groups reflects the treatment alone. Simple before-and-after comparisons cannot disentangle treatment effects from regression to the mean.
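The difference between a before-and-after comparison and a randomized comparison can be sketched with a simulated treatment that does nothing at all. All numbers here are hypothetical: symptom scores (higher is worse) fluctuate around each patient's usual level, only patients scoring 70 or more are enrolled, and `true_effect` is set to zero.

```python
import random
import statistics

def simulate_trial(n=20_000, threshold=70.0, true_effect=0.0, seed=1):
    """Enrol only patients whose symptom score is high today, then randomize.

    Returns the mean change (after minus before) for each arm.
    """
    rng = random.Random(seed)
    arms = {"treated": ([], []), "control": ([], [])}
    for _ in range(n):
        usual = rng.gauss(50.0, 10.0)           # patient's long-run symptom level
        before = usual + rng.gauss(0.0, 10.0)   # score on the day of enrolment
        if before < threshold:
            continue                            # not enrolled: symptoms not bad enough
        after = usual + rng.gauss(0.0, 10.0)    # score at follow-up
        arm = "treated" if rng.random() < 0.5 else "control"
        if arm == "treated":
            after -= true_effect                # zero by default: a useless treatment
        arms[arm][0].append(before)
        arms[arm][1].append(after)
    return {
        arm: statistics.fmean(after) - statistics.fmean(before)
        for arm, (before, after) in arms.items()
    }

change = simulate_trial()
print(f"treated, after minus before: {change['treated']:.1f}")
print(f"control, after minus before: {change['control']:.1f}")
print(f"treated minus control:       {change['treated'] - change['control']:.1f}")
```

Both arms show a large apparent improvement because everyone was enrolled at an unusually bad moment. Only the comparison between the arms, which is close to zero here, says anything about the treatment itself.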

Sports and Regression to the Mean

The "sophomore slump" is a textbook example of regression to the mean. Rookie of the Year winners frequently see their statistics decline the following season. This is not a "jinx" but a statistically predictable phenomenon. Winning Rookie of the Year usually takes both skill and favorable random variation in the same season; the following year the skill remains, but the favorable variation is unlikely to repeat to the same degree.

Similarly, the "Sports Illustrated cover jinx" (athletes who appear on the cover subsequently declining) can be explained by regression to the mean. Athletes are selected for the cover when their recent performance has been extremely good, so a return to more typical performance afterward is what statistics would predict.

Reading Data with Regression to the Mean in Mind

When interpreting ranking data, the following principles help avoid erroneous conclusions. First, do not over-trust a single extreme measurement. Using the average of multiple measurements reduces the influence of random variation.
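The benefit of averaging can be checked numerically. The sketch below assumes a fixed true value of 70 measured with independent noise of standard deviation 1; both numbers and the function name are illustrative. Under that assumption, the spread of the average of k readings should shrink roughly in proportion to 1/sqrt(k).

```python
import random
import statistics

def spread_of_averages(k, trials=5_000, noise_sd=1.0, seed=2):
    """Standard deviation of the mean of k noisy readings of one fixed true value."""
    rng = random.Random(seed)
    true_value = 70.0
    averages = [
        statistics.fmean(true_value + rng.gauss(0.0, noise_sd) for _ in range(k))
        for _ in range(trials)
    ]
    return statistics.stdev(averages)

for k in (1, 4, 16):
    print(f"{k:2d} reading(s): spread of the average = {spread_of_averages(k):.2f}")
```

Quadrupling the number of readings only halves the noise, so a handful of measurements taken under similar conditions already removes much of the random variation, while further readings bring diminishing returns.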

Second, when evaluating "change," mentally subtract regression to the mean. Change from an extreme value may be what statistics predicts anyway rather than the effect of an intervention. Third, if you select the most extreme individuals from a group and track them over time, you will almost certainly see them move back toward the average: apparent "deterioration" among those selected for the best results, and apparent "improvement" among those selected for the worst. This is the combined effect of selection bias and regression to the mean, not actual change.
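One way to make the "mental subtraction" concrete is the standard linear prediction for a repeated measurement: the expected follow-up is the group mean plus the reliability times the distance of the first result from that mean. The sketch below assumes the two measurements have equal variance and a linear relationship; `reliability` is the correlation between repeated measurements, and the numbers in the example are hypothetical.

```python
def expected_followup(observed, group_mean, reliability):
    """Expected next measurement if nothing real has changed.

    reliability is the correlation between two measurements of the same
    quantity: 0 means pure noise, 1 means no noise at all.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return group_mean + reliability * (observed - group_mean)

def change_beyond_regression(first, second, group_mean, reliability):
    """Observed follow-up minus the follow-up that regression alone predicts."""
    return second - expected_followup(first, group_mean, reliability)

# Hypothetical case: first score 130, group mean 100, reliability 0.5.
print(expected_followup(130.0, 100.0, 0.5))                # regression alone predicts 115.0
print(change_beyond_regression(130.0, 120.0, 100.0, 0.5))  # 5.0 above that prediction
```

In this example a drop from 130 to 120 looks like a decline, but it is 5 points better than regression alone would predict. In practice the reliability is rarely known exactly, so the result is best read as a rough adjustment rather than a precise correction.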
