What Does Multiple Linear Regression Look Like?

Consider once again the regression of homocysteine on B12 and folate (all logged). It's common to think of the data in terms of pairwise scatterplots. The regression equation

LHCY = 1.570602 - 0.082103 LCLC - 0.136784 LB12

is often mistakenly thought of as a kind of line. However, it is not a line, but a surface.
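A minimal sketch of why the equation describes a plane rather than a line: the predicted value depends on two inputs, and a one-unit step in either input changes the prediction by the same amount wherever the step is taken. The coefficients are copied from the equation above; the function name predict_lhcy and the grid values are assumptions made only for illustration and are not taken from the data.

```python
# Coefficients copied from the fitted equation in the text.
INTERCEPT = 1.570602
B_LCLC = -0.082103
B_LB12 = -0.136784


def predict_lhcy(lclc, lb12):
    """Height of the regression plane above the point (lclc, lb12)."""
    return INTERCEPT + B_LCLC * lclc + B_LB12 * lb12


# Hypothetical grid values, chosen for illustration only (not from the data).
lclc_grid = [0.0, 1.0, 2.0]
lb12_grid = [5.0, 6.0, 7.0]

# One row of predicted heights per LB12 value: a grid of points on the surface.
for lb12 in lb12_grid:
    heights = [round(predict_lhcy(lclc, lb12), 6) for lclc in lclc_grid]
    print("LB12 =", lb12, "->", heights)

# On a flat plane, the change per unit step in LCLC is the same at every LB12.
steps = [predict_lhcy(1.0, lb12) - predict_lhcy(0.0, lb12) for lb12 in lb12_grid]
print("change per unit LCLC at each LB12:", [round(s, 6) for s in steps])
```

Each printed row is a slice through the surface at a fixed LB12. The rows are parallel lines, which is what a flat plane looks like when sliced this way.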

Each observation is a three-dimensional vector {(xi, yi, zi), i = 1, ..., n} [here, (LCLCi, LB12i, LHCYi)]. When plotted in a three-dimensional space, the data look like the picture to the left.

It can be difficult to appreciate a two-dimensional representation of three-dimensional data. The picture is redrawn with spikes from each observation to the plane defined by LCLC and LB12 to give a better sense of where the data lie.

The final display shows the regression surface. It is a flat plane. Predicted values are obtained by starting at the intersection of LB12 and LCLC on the LB12-LCLC plane and travelling parallel to the LHCY axis until the regression plane is reached (in the manner of the spike, but to the regression plane instead of the observation). Residuals are calculated as the signed distance from the observation to the regression plane, again travelling parallel to the LHCY axis.
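The two steps just described can be sketched in a few lines. The coefficients come from the equation in the text; the observation used here is hypothetical, made up purely to show the arithmetic, and the function names are illustrative assumptions. The residual is taken as observed minus predicted, which is the usual sign convention.

```python
# Coefficients copied from the fitted equation in the text.
INTERCEPT = 1.570602
B_LCLC = -0.082103
B_LB12 = -0.136784


def predicted_value(lclc, lb12):
    """Start at (lclc, lb12) on the LB12-LCLC plane and move parallel to
    the LHCY axis until the regression plane is reached."""
    return INTERCEPT + B_LCLC * lclc + B_LB12 * lb12


def residual(lclc, lb12, lhcy):
    """Signed distance from the observation to the regression plane,
    measured parallel to the LHCY axis (observed minus predicted)."""
    return lhcy - predicted_value(lclc, lb12)


# Hypothetical observation (LCLC, LB12, LHCY); not taken from the data set.
obs = (1.5, 6.0, 0.7)

fit = predicted_value(obs[0], obs[1])
res = residual(*obs)

print("predicted LHCY:", round(fit, 7))
print("residual:", round(res, 7))
```

A positive residual means the observation sits above the plane, a negative one means it sits below. The distance is measured along the LHCY axis, not perpendicular to the plane.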

The same thing happens with more than 2 predictors, but it's hard to draw a two-dimensional representation of it. With p predictors, the regression surface is a p-dimensional hyperplane in a (p+1)-dimensional space.
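As a rough sketch of the general case, the height of the hyperplane above any point is the intercept plus one slope times one coordinate for each of the p predictors. The function name hyperplane_height is an assumption for illustration. The p = 2 call reuses the coefficients from the text; the p = 4 coefficients and inputs are entirely hypothetical.

```python
def hyperplane_height(intercept, slopes, point):
    """Height of a regression hyperplane above `point`.

    `slopes` and `point` must each hold one value per predictor.
    """
    if len(slopes) != len(point):
        raise ValueError("need exactly one coordinate per slope")
    return intercept + sum(b * x for b, x in zip(slopes, point))


# p = 2: the plane from the text, evaluated at a hypothetical (LCLC, LB12).
plane = hyperplane_height(1.570602, [-0.082103, -0.136784], [1.5, 6.0])
print("p = 2:", round(plane, 7))

# p = 4: made-up coefficients and inputs, only to show nothing else changes.
hyper = hyperplane_height(2.0, [0.5, -1.0, 0.25, 0.0], [1.0, 2.0, 4.0, 9.0])
print("p = 4:", hyper)
```

The calculation is the same for any p. Only the picture becomes impossible to draw.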


Copyright © 2001 Gerard E. Dallal