Residuals represent the Error in a Regression Model. They represent the Variation in the y variable which is not explained by the Regression Model. A Residual is the difference between a given y value in the data and the y value predicted by the Model.

Residuals must be Random. There are several kinds of non-Randomness to look for. One is unexplained Outliers, and a Box and Whiskers Plot like the one shown below can be used to identify them. The Interquartile Range (IQR) box spans the middle 50% of the values, from the 1st Quartile (Q1) to the 3rd Quartile (Q3). In this example, the IQR is 60 - 40 = 20. Horizontal "whiskers" are drawn to extend 1.5 box-lengths on either side of the box.
Outliers are defined as those Residuals beyond these "whiskers".
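To make the rule concrete, here is a minimal Python sketch (not from the book) that applies the 1.5 x IQR whisker rule to a set of made-up residual values:

```python
# A minimal sketch of the IQR outlier rule applied to regression
# residuals. The residual values below are made up for illustration.
import numpy as np

residuals = np.array([-9.1, -4.2, -2.8, -1.5, -0.3, 0.4, 1.1, 2.6, 3.9, 12.7])

q1, q3 = np.percentile(residuals, [25, 75])  # quartile boundaries of the IQR box
iqr = q3 - q1                                # length of the box
lower_whisker = q1 - 1.5 * iqr               # whiskers extend 1.5 box-lengths
upper_whisker = q3 + 1.5 * iqr

outliers = residuals[(residuals < lower_whisker) | (residuals > upper_whisker)]
print(f"IQR = {iqr:.2f}, whiskers at [{lower_whisker:.2f}, {upper_whisker:.2f}]")
print("Outlier residuals:", outliers)
```

Any Residual flagged this way is a candidate for investigation, not automatic deletion.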
These graphs show the difference between a Distribution with Discrete data and a Discrete stairstep Probability graph, compared to a Distribution with Continuous data and a Continuous smooth curve. For the Discrete data Distribution, the values of the Variable X can only be non-negative integers, because they are Counts. There is no Probability shown for 1.5, for example, because 1.5 is not an integer, and so it is not a legitimate value for X. The Probabilities for a Discrete data Distribution are shown as separate columns. There is nothing between the columns, because there are no values on the horizontal axis between the individual integers.

For Continuous Distributions, the values of the horizontal-axis Variable are real numbers, and there are an infinite number of them between any two integers. Continuous data are also called Measurement data; examples are length, weight, pressure, etc. The Probabilities for Continuous Distributions are infinitesimal points on smooth curves.

For the first six Distributions described in the table above, the data used to create the values on the horizontal axis come from a single Sample or Population or Process. And the data are either Discrete or Continuous.
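As a hedged numerical illustration (the Binomial and Normal parameters below are arbitrary choices, not from the book), a few lines of Python with scipy show the Discrete/Continuous contrast:

```python
# A sketch contrasting a Discrete distribution (Binomial counts, defined
# only at integers) with a Continuous one (Normal, defined at every real
# value). Parameters are chosen arbitrarily for illustration.
from scipy import stats

binom = stats.binom(n=10, p=0.4)     # Discrete: X can only be 0, 1, ..., 10
norm = stats.norm(loc=4, scale=1.5)  # Continuous: X can be any real number

print(binom.pmf(3))    # P(X = 3) is a real, positive Probability
print(binom.pmf(1.5))  # 0.0 -- 1.5 is not a legitimate value for a Count
print(norm.pdf(1.5))   # the height of the smooth curve, not a Probability
print(norm.cdf(5) - norm.cdf(3))  # Probability comes from area under the curve
```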
The F and Chi-Square (χ²) Distributions are hybrids. Their horizontal-axis Variable is calculated from a ratio of two numbers, and the source data don't have to be one type or the other. Being a ratio, the horizontal-axis Variable (F or χ²) is Continuous, and the Probability curve is smooth and Continuous. For more, see my YouTube video Probability Distributions -- Part 1 (of 3): What They Are. There are also videos on the F Distribution and the Chi-Square Distribution. See the Videos page of this website for the latest status of available and planned videos.

In our Tip of the Week for Jan 25, 2018, we described Sum of Squares Within (SSW) as a measure of Variation in a group (Sample, Population, or Process) of data values. In ANOVA, Sum of Squares Between (SSB) is used together with SSW to determine whether there is a Statistically Significant difference among the Means of several groups.

Here's a conceptual illustration of Variation Within and Between groups. Each bell-shaped curve represents a group. The widths of the curves represent how much variation there is within each. That is what SSW represents. For the variation between Means, we calculate the differences between the Means of each group and the Overall Mean. Then we square those differences, weight each by its group's Sample Size, and sum them. This gives us the Sum of Squares Between:

SSB = Σ n (x-bar − x-double-bar)²

where x-bar is the Mean of an individual group, x-double-bar is the Overall Mean, n is the Sample Size of the individual group, and the sum is taken over all the groups.
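Here is a minimal Python sketch, with three made-up groups of data, of how SSB and SSW would be computed from the definitions above:

```python
# A minimal sketch of Sum of Squares Between (SSB) and Within (SSW)
# for three made-up groups of data.
import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]

grand_mean = np.mean(np.concatenate(groups))  # x-double-bar, the Overall Mean

# SSB: squared differences between each group Mean and the Overall Mean,
# weighted by the group's Sample Size
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# SSW: the sum of each group's own Sum of Squares about its own Mean
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}")
```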
ANOVA then calculates a Mean SSB (MSB) and a Mean SSW (MSW). The previous Tip of the Week describes how these are used to calculate the value of the Test Statistic F in the F-test which produces the conclusion from the ANOVA. There is also more on this in my YouTube video ANOVA -- Part 2: How It Does It.

Sums of Squares are measures of variation; there are a number of different types. Our Tip of the Week for Dec. 21, 2017 described Sum of Squares Total (SST), which is used in Regression. Sum of Squares Within (SSW) is used in ANOVA. Sum of Squares Between (SSB) is also used in ANOVA, and it will be the topic of another Statistics Tip of the Week.

Sum of Squares Within, SSW, is the sum of the Variations (as expressed by the Sums of Squares, SS's) within each of several Groups (usually Samples):

SSW = SS1 + SS2 + ... + SSn

This is not numerically precise, but conceptually, one might picture SS as the width of the "meaty" part of a Distribution curve -- the part without the skinny tails on either side. Sum of Squares Within, SSW, summarizes how much Variation there is by giving the sum of all such Variations within each of the Groups.
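Pulling these Tips together, here is a hedged sketch (the same made-up groups as above, repeated so the snippet runs on its own) of how SSB and SSW become MSB, MSW, and the F Statistic; scipy.stats.f_oneway is used only as a cross-check:

```python
# A sketch of how ANOVA turns SSB and SSW into the F Test Statistic,
# cross-checked against scipy.stats.f_oneway. Data are made up.
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]

k = len(groups)                       # number of groups
n_total = sum(len(g) for g in groups)
grand_mean = np.mean(np.concatenate(groups))

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

msb = ssb / (k - 1)          # Mean SSB: SSB over its Degrees of Freedom
msw = ssw / (n_total - k)    # Mean SSW: SSW over its Degrees of Freedom
f_stat = msb / msw

print(f"F = {f_stat:.3f}")
print(stats.f_oneway(*groups))  # scipy gives the same F, plus a p-value
```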
A comparatively small SSW indicates that the data within the individual Groups are tightly clustered about their respective Means. If the data in each Group represent the effects of a particular treatment, for example, this comparatively small SSW is indicative of consistent results (good or bad) within each individual treatment. "Small" is a relative term, so the word "comparatively" is key here. We'll need to compare SSW with SSB before being able to make a final determination.

A comparatively large SSW shows that the data within the individual Groups are widely dispersed. This would indicate inconsistent results within each individual treatment. For more on the subject and related concepts, see my video, ANOVA Part 2 (of 4): How It Does It.

Randomness is likely to be representative, and Simple Random Sampling (SRS) can often be the most effective way to achieve it. But in certain situations, other methods such as Systematic, Stratified, and Clustered Sampling may have an advantage. In our Tip of the Week for November 27, 2017, Stratified Sampling was described. This Tip is about Clustered Sampling. To perform Clustered Sampling (a code sketch follows these steps):
- Divide the Population or Process into small clusters (e.g. city blocks)
- Select a Simple Random Sample of these clusters
- Collect data from each unit within each cluster
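Here is the sketch promised above: a toy Python illustration in which the clusters and unit names (hypothetical identifiers like block0-resident1) stand in for city blocks and their residents:

```python
# A hedged sketch of Clustered Sampling: the clusters and units here are
# synthetic stand-ins for city blocks and their residents.
import random

random.seed(1)

# 20 clusters ("city blocks"), each containing 10 unit IDs ("residents")
clusters = {block: [f"block{block}-resident{i}" for i in range(10)]
            for block in range(20)}

# Steps 1-2: take a Simple Random Sample of the clusters themselves
chosen_blocks = random.sample(sorted(clusters), k=4)

# Step 3: collect data from every unit within each chosen cluster
sample = [unit for block in chosen_blocks for unit in clusters[block]]
print("Chosen blocks:", chosen_blocks)
print(len(sample), "units sampled")
```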
Advantage: It can be less time-consuming and less expensive. For example, suppose the Population is the inhabitants of a city, and a cluster is a city block. We randomly select an SRS of city blocks. There is less time and travel involved in traveling to a limited number of city blocks and then walking door to door, compared with traveling to more-widely-separated individuals all over the city. Also, one does not need a Sampling Frame listing all individuals, just all clusters.

Disadvantage: The increased Variability due to between-cluster differences may reduce accuracy.

Chi-Square is a Test Statistic; as such, it has a family of Distributions. There is a different Chi-Square Distribution for each value of Degrees of Freedom, df.
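As a rough numerical substitute for a plot, the following sketch uses scipy.stats.chi2 to print the height of the Probability density curve at a few points, for several values of df:

```python
# A sketch of how the Chi-Square Distribution changes shape as df grows,
# using scipy.stats.chi2. Printing curve heights stands in for a plot.
import numpy as np
from scipy import stats

x = np.linspace(0.5, 15, 6)          # a few points along the horizontal axis
for df in (1, 3, 5, 10):
    heights = stats.chi2.pdf(x, df)  # the Probability density curve for this df
    print(f"df={df:2d}:", np.round(heights, 3))
```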
Commonly, Degrees of Freedom is the Sample Size minus 1. But this isn't always the case with Chi-Square. That will be covered in a future Tip of the Week. The sketch above gives some examples of how the Chi-Square Distribution varies as df increases. You might observe that the shape of the Chi-Square Distribution is similar to that of the F Distribution. For more on Chi-Square, its Distributions, and its Tests, see the book or this video.

"Reject the Null Hypothesis" is one of two possible outcomes of a Hypothesis Test. The other is "Fail to Reject the Null Hypothesis". Both of these statements can be confusing to many people. Let's try to clarify the concept of "Reject the Null Hypothesis". The Null Hypothesis states that there is no (Statistically Significant)
- difference,
- or change,
- or effect
For example:
- There is no difference between the Means of these two Populations.
- There has been no change in the Mean of this Population from its historical Mean.
- This treatment has no effect.
If the results of a statistical test indicate "Reject the Null Hypothesis", that means we conclude that there is a (Statistically Significant)
- difference,
- or change,
- or effect
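As a hedged sketch of how this decision plays out in practice (the data are simulated, and the 2-Sample t-test is just one example of a Hypothesis Test), consider:

```python
# A minimal sketch of "Reject" vs "Fail to Reject" using a 2-Sample t-test
# on made-up data; the Null Hypothesis is "no difference between the Means".
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample_a = rng.normal(loc=50.0, scale=5.0, size=30)
sample_b = rng.normal(loc=54.0, scale=5.0, size=30)

alpha = 0.05  # the chosen Level of Significance
t_stat, p_value = stats.ttest_ind(sample_a, sample_b)

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: Reject the Null Hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: Fail to Reject the Null Hypothesis")
```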
Picture a cartoon: a man proposes marriage, and the woman answers, "I reject the Null Hypothesis." What is the Null Hypothesis to which she is referring? As we said earlier, the Null Hypothesis says there is no difference, change, or effect. Before his proposal, they were not engaged to be married. So, if there is to be no difference, change, or effect in their relationship as a result of his proposal and her response, the Null Hypothesis would say that they are not to be engaged. But she
rejects the Null Hypothesis. This indicates that she does want there to be a difference, change, or effect. She does want to change their status to engaged to be married.

If you still find this a little confusing, you might want to go to my YouTube channel and see the video on this subject: Reject the Null Hypothesis. There are also videos on these related concepts: Null Hypothesis; Alternative Hypothesis; Fail to Reject the Null Hypothesis; and Alpha, p, Critical Value and Test Statistic -- How They Work Together.
For more on available and planned videos based on content from my book, see the Videos page on this website.

The concept of Sampling Distribution may not be used so much on its own as it is in describing the concepts of the Central Limit Theorem and Standard Error:
- Central Limit Theorem: Even if the data are not Normally distributed, the Sampling Distribution of the Means or Proportions of the data approaches the Normal Distribution as the Sample Size, n, increases.
- The Standard Error is defined as the Standard Deviation of the Sampling Distribution.
The visual we're going to use is similar to a Dot Plot of a Sample of data. Each dot represents a single value of x in the Sample. In the example below, 6 values in the Sample were x = 14, so we show 6 dots over "14" on the horizontal x-axis. In a Sampling Distribution, the values which are plotted are not individual data points as in this dot plot. The values are Statistics calculated from Samples. Statistics are numerical properties of Samples, such as the Mean or Proportion or Standard Deviation. Let's show a plot of the Means of some Samples of data. Instead of a dot, we'll show each Mean as an x-bar symbol. The x-bars are stacked vertically above the x-axis at the point which represents their value. The height of each stack corresponds to the Probability of that value of the Mean.
This illustration shows a Sampling Distribution. The Sampling Distribution would show the Means of every possible Sample of a given size, n. You can see that the x-bars form a shape which roughly resembles a Normal Distribution. If we had many more Samples with a large n, the resemblance would be much closer. And if we calculated the Standard Deviation of the Distribution, we would get the Standard Error of the Mean.
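A short simulation makes both ideas concrete; the Exponential Population below is an arbitrary choice of a clearly non-Normal shape:

```python
# A sketch of a Sampling Distribution: draw many Samples from a skewed
# (non-Normal) Population, record each Sample's Mean, and compare the
# spread of those Means to the theoretical Standard Error.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10.0, size=100_000)  # clearly not Normal

n = 50  # Sample Size
sample_means = [rng.choice(population, size=n).mean() for _ in range(5_000)]

print("Mean of the Sampling Distribution:", np.mean(sample_means))
print("Standard Deviation of the Sampling Distribution (Standard Error):",
      np.std(sample_means))
print("Theoretical Standard Error (sigma / sqrt(n)):",
      population.std() / np.sqrt(n))
```

Even though the Population is skewed, the printed Sampling Distribution of the Means is centered on the Population Mean, and its Standard Deviation closely matches sigma divided by the square root of n, as the Central Limit Theorem predicts.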
Randomness is likely to be representative, and Simple Random Sampling (SRS) can often be the most effective way to achieve it. But in certain situations, other methods such as Systematic, Stratified, and Clustered Sampling may have an advantage.

Stratified Sampling can be used when we know the Proportions of homogeneous groups which make up the Population. To perform Stratified Sampling:
- Divide the Population or Process into homogeneous groups (strata).
- Select a Simple Random Sample from each group such that the Sample Size for each group corresponds to the known Proportion of the group in the Population or Process.
For example, if the Population is 55% women and 45% men and the Sample Size is n = 100, we Randomly select 55 women and 45 men.

Advantage: Stratified Sampling avoids selecting a Sample which is not representative -- at least with regard to the Proportions of the homogeneous groups.

Disadvantage: It can't be used when there are no homogeneous groups.
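Here is a toy Python sketch of that 55%/45% example; the member lists are synthetic stand-ins for a real Sampling Frame:

```python
# A hedged sketch of Stratified Sampling with proportional allocation.
# The "population" lists here are synthetic stand-ins for a Sampling Frame.
import random

random.seed(7)

strata = {"women": [f"woman{i}" for i in range(550)],
          "men":   [f"man{i}" for i in range(450)]}
n = 100
pop_size = sum(len(members) for members in strata.values())

sample = []
for name, members in strata.items():
    k = round(n * len(members) / pop_size)    # proportional allocation
    sample.extend(random.sample(members, k))  # SRS within each stratum

print(len(sample), "sampled:",
      sum(s.startswith("woman") for s in sample), "women,",
      sum(s.startswith("man") and not s.startswith("woman") for s in sample), "men")
```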
## Author
Andrew A. (Andy) Jawlik is the author of the book, Statistics from A to Z -- Confusing Concepts Clarified, published by Wiley.