by Michael Stastny.
Imagine you want to conduct a study on the effectiveness of punishment and reward on flight training. You praise trainees for good landings and reprimand them for near crashes, and observe whether this has any effects.
After spending some time on the airfield, your data indicate that trainees do worse on their next flight if they have just been praised, and fly better just after being yelled at.
Then you have a conversation with the flight instructor who does the written examinations. He tells you that experience shows students who do well on midterms slack off on their final exam, while students who do badly generally get better.
What does this mean? Do students’ test scores tend to revert to some mean?
If you answered yes, you’ve just committed the famous “Galton’s fallacy”.
What did the 19th century statistician and geneticist Sir Francis Galton, Charles Darwin’s cousin, do wrong?
Well, he plotted the height of fathers against the height of their sons and discovered sons of tall fathers tended to be tall, but on average not as tall as their fathers. Similarly, sons of short fathers tended to be short, but on average not as short as their fathers. He—and this was his mistake—was immediately concerned that the sons of tall fathers are regressing into a pool of mediocrity along with the sons of everybody else.
So what’s wrong with that? Well, height tends to be normally distributed (i.e., the distribution takes a bell-shaped form) and sons’ heights are, due to heredity, correlated with their fathers’ heights. So there is a linear dependence between fathers’ and sons’ heights.
To see why this leads to a fallacy, let’s do a simulation where we generate, loosely speaking, 250 pairs of standard normally distributed random variables with a correlation coefficient of 0.5 and plot them against each other.
The resulting scatterplot is similar to what Francis Galton was looking at about 120 years ago. In the plot, just imagine the average height was rescaled to zero.
The blue line is the regression line and the red line is the 45 degree line (standard deviation line). Galton expected the regression line to be the 45 degree line; that is, he expected that a father with a height of 2 on the x-axis would, on average, have a son with a height of 2 on the y-axis. But because the correlation between x and y is only 0.5 rather than 1, the regression line shows that a father with a height of 2 is expected to have a son of height 1.
The technical explanation is that the slope you get from regressing two normally distributed random variables with equal variances against each other is the correlation coefficient, and therefore a father with height 2 would give us:
1 = 0.5*2,
since the regression equation is y = 0.5*x.
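The simulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names simulate_pairs and ols_slope are made up for this example, the seed is arbitrary, and the correlated pairs are built by mixing two independent standard normal draws, which is one common way to get a correlation of 0.5 and not necessarily how the original plot was produced.

```python
import random

def simulate_pairs(n, rho, seed=0):
    """Draw n pairs (x, y) of standard normal variables with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        # Mixing x with independent noise z gives y unit variance
        # and correlation rho with x.
        y = rho * x + (1.0 - rho ** 2) ** 0.5 * z
        pairs.append((x, y))
    return pairs

def ols_slope(pairs):
    """Least-squares slope from regressing y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

# 250 father-son pairs, as in the text (x = father, y = son).
pairs = simulate_pairs(250, 0.5, seed=1)
slope = ols_slope(pairs)
print(f"fitted slope: {slope:.2f}")
print(f"slope times a father's height of 2: {2 * slope:.2f}")
```

With only 250 pairs the fitted slope will wobble around 0.5 from sample to sample; with a much larger sample it settles close to the correlation coefficient.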
So this appearance of “regressing to the mean” is a statistical mirage due to regressing two random but correlated variables upon each other. Mistakenly attaching some special meaning to this phenomenon is the technical explanation of Galton’s fallacy.
The non-technical explanation is a little trickier. Let’s go back to our two examples from the flight school.
First, set up the following model:
Y = T + e.
“Y” is the test score you actually receive on a certain exam. “T” is the “true” test score—the score you actually deserve—and “e” is a chance error due to, say, being sick before the test or lucky guessing on exam day.
Now assume that the distribution of true scores follows a normal distribution with a mean of 100 and a standard deviation of 15. Suppose further that the chance error, “e”, is either -5 or 5 with equal probability.
Now, if someone scores Y = 140 on his test, there are two possible explanations: either his true score is 135 and he was lucky (e = 5), or his true score is 145 and he was unlucky (e = -5). The first explanation is, of course, more likely than the second, since “T” is assumed to be normally distributed around 100, and the more a true score deviates from the mean, the less likely it is to occur.
If you plot the test scores of the first exam against the test scores of the second exam, you will notice that those with very low scores on the first exam will see their average move up toward the overall mean (some of those who had a negative chance error will have a positive chance error this time), while those with high scores on their first exam will see their scores move down.
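To put a number on the claim that a true score of 135 is more likely than 145 when Y = 140, here is a small Python sketch. It assumes exactly the model above (T normal with mean 100 and standard deviation 15, e equal to -5 or 5 with equal probability); the function names are invented for this illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_true_scores(observed, mean=100.0, sd=15.0, error=5.0):
    """Probability of each candidate true score given the observed score."""
    # Y = T + e, so T is either Y - 5 (lucky) or Y + 5 (unlucky).
    candidates = [observed - error, observed + error]
    # Both error values have probability 0.5, so that factor cancels
    # and only the density of T at each candidate matters.
    weights = [normal_pdf(t, mean, sd) for t in candidates]
    total = sum(weights)
    return {t: w / total for t, w in zip(candidates, weights)}

post = posterior_true_scores(140.0)
for true_score, prob in post.items():
    print(f"P(T = {true_score:.0f} | Y = 140) = {prob:.3f}")
```

Under these assumptions the lucky explanation (T = 135) comes out at roughly 0.86 against 0.14 for the unlucky one, so a student who scored 140 is more likely than not to score lower next time.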
That’s what gives the appearance of a “regression to the mean.” And the same explanation applies to the father-son sample: a father with a height of 2 (which is 2 standard deviations taller than the average father) does not have on average a son with an equally extreme height.
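The exam model can also be simulated directly. The sketch below is an illustration only: it assumes each student keeps the same true score across both exams and draws a fresh chance error each time, and the cutoffs of 130 and 70 for "high" and "low" first-exam scores are arbitrary choices made for this example.

```python
import random

def simulate_exams(n, seed=0, mean=100.0, sd=15.0, error=5.0):
    """Return (first_score, second_score) for n students under Y = T + e."""
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        true_score = rng.gauss(mean, sd)
        # Each exam gets its own independent chance error of -5 or +5.
        first = true_score + rng.choice((-error, error))
        second = true_score + rng.choice((-error, error))
        results.append((first, second))
    return results

def mean_change(results, keep):
    """Average of (second - first) among students whose first score passes keep."""
    changes = [second - first for first, second in results if keep(first)]
    return sum(changes) / len(changes)

results = simulate_exams(100_000, seed=1)
top = mean_change(results, lambda first: first >= 130)
bottom = mean_change(results, lambda first: first <= 70)
print(f"average change for first-exam scores >= 130: {top:+.2f}")
print(f"average change for first-exam scores <= 70:  {bottom:+.2f}")
```

Nobody's true score changes between the two exams in this simulation, yet the high scorers drop on average and the low scorers improve, simply because the first-exam groups were selected partly on their chance errors.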
For more on Galton’s fallacy, I highly recommend reading “Galton’s Fallacy and Tests of the Convergence Hypothesis” by Danny Quah—a well-known paper among economists, especially growth theorists.
Update: Via email a reader writes, “If Galton just used orthogonal (also called type II or Deming) regression to minimize the sum of squared ‘perpendicular’ distances between the points and the fitted line, the slope would have been very close to one. Though a small amount of publications discuss such orthogonal regression for measurement devices with error existing for the X axis variable, this same physiological data analysis should also use the same methods.” More on Galton and orthogonal regression here.
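The reader's point can be checked on simulated data. The sketch below is a rough illustration, not the reader's own calculation: it generates standard normal pairs with correlation 0.5 as in the earlier simulation and uses the standard closed-form slope for the line that minimizes squared perpendicular distances. The function names and the seed are made up for this example.

```python
import math
import random

def simulate_heights(n, rho, seed=0):
    """Draw n (father, son) pairs of standard normal heights with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs

def sums_of_squares(pairs):
    """Centered sums of squares and cross-products (sxx, syy, sxy)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxx, syy, sxy

def ordinary_slope(pairs):
    """Slope minimizing squared vertical distances (y regressed on x)."""
    sxx, _, sxy = sums_of_squares(pairs)
    return sxy / sxx

def orthogonal_slope(pairs):
    """Slope minimizing squared perpendicular distances (assumes sxy != 0)."""
    sxx, syy, sxy = sums_of_squares(pairs)
    diff = syy - sxx
    return (diff + math.sqrt(diff ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)

pairs = simulate_heights(250, 0.5, seed=1)
print(f"ordinary least squares slope: {ordinary_slope(pairs):.2f}")
print(f"orthogonal regression slope:  {orthogonal_slope(pairs):.2f}")
```

The orthogonal slope lands near one here because fathers and sons were given the same variance; with unequal variances it would sit near the ratio of the two standard deviations instead.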