Statistics Tip of the Week: Use a Box and Whiskers Plot to identify Outlier Residuals in your Regression Model
Residuals represent the Error in a Regression Model. They represent the Variation in the y variable which is not explained by the Regression Model. A Residual is the difference between a given y value in the data and the y value predicted by the Model.
Residuals must be Random. There are several kinds of non-Randomness to look for. One is unexplained Outliers. And a Box and Whiskers Plot like the one shown below can be used to identify them.
The Interquartile Range (IQR) box shows the Range of the values around the Mean which comprise 50% of the total values. In this example, the IQR is 60 - 40 = 20. Horizontal "whiskers" are drawn to extend 1.5 box-lengths on either side of the box.
Outliers are defined as those Residuals beyond these "whiskers".
Andrew A. (Andy) Jawlik is the author of the book, Statistics from A to Z -- Confusing Concepts Clarified, published by Wiley.