Category: Stats Tip Of The Week

Statistics Tip of the Week: Simple Non-linear Regression fits a curve to non-linear x,y data.

8/22/2017

The "Simple" means that there is only one x variable: y = f(x). So the curve exists in two dimensions and can be plotted on a screen or a sheet of paper. The "non-linear" means that, when we graph the data, it does not roughly follow a straight line, so we must look for an appropriate curve.

The following are probably the most common non-linear curves used:

Exponential and Logarithmic curves have rapid accelerations or decelerations in the slope. Power curves have a more gradual change. Polynomial functions can be used for more complex curves, which change direction, as shown in our Tip of the Week for May 17, 2017.
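
One simple way to try an Exponential curve on x,y data is sketched below. It is only an illustration under stated assumptions: the data are made up, the function name fit_exponential is hypothetical, and the fit takes the natural log of y and then fits a straight line by ordinary least squares. That log-transform shortcut is not a full non-linear least-squares fit, and it works only when every y value is positive.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by a straight-line fit to ln(y). Requires all y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    # Ordinary least-squares slope and intercept for ln(y) = ln(a) + b * x
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b

# Hypothetical, noise-free data generated from y = 2 * exp(0.5 * x)
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]

a, b = fit_exponential(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real, noisy data the recovered a and b would only approximate the underlying curve, and a plot of the fitted curve over the data points is still the best first check.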

Statistics Tip of the Week: an Exponential Distribution can be used with problems involving time.

8/17/2017

The Exponential Distribution is useful for solving problems involving time to an event or time between events -- for example, time between emergency calls or time between equipment failures.

It is especially useful with events that are relatively rare. If one were to analyze rare events per time period, using the Poisson Distribution, for example, the Counts might include a lot of zeros and an occasional 1. It may be more meaningful to think in terms of the time between events and measure the data that way. Then the Exponential Distribution could be used.

An individual Exponential Distribution can be specified by just one Parameter – either the Mean (µ), or the Rate (λ).
λ = 1/µ
(If the Mean time to an event is 8 hours, then the Rate at which the events occur is 1/8 per hour.)

An interesting fact about all Exponential Distributions: The Mean always splits the Cumulative Probabilities (areas under the curve) into 63% and 37%.
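
The 63% / 37% split can be checked with a few lines of arithmetic. This is a minimal sketch: the function name exponential_cdf is hypothetical, and the 8-hour Mean is the example value from the text. The Cumulative Probability up to time t is 1 - e^(-λt), so at t equal to the Mean it is 1 - e^(-1) regardless of the Mean chosen.

```python
import math

def exponential_cdf(t, mean):
    """Probability that the event occurs at or before time t."""
    rate = 1.0 / mean  # lambda = 1 / mu
    return 1.0 - math.exp(-rate * t)

mean_hours = 8.0
rate_per_hour = 1.0 / mean_hours  # 1/8 per hour

p_before_mean = exponential_cdf(mean_hours, mean_hours)  # 1 - e^-1
p_after_mean = 1.0 - p_before_mean

print(f"{p_before_mean:.3f} {p_after_mean:.3f}")  # 0.632 0.368
```

Changing mean_hours to any other positive value gives the same two probabilities.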

In 2-Factor ANOVA, Intersecting lines can indicate an Interaction

8/10/2017

In our March 2, 2017 Tip of the Week, we discussed 2-Factor (aka 2-Way) ANOVA. We said that:

Separated lines indicate that Factor A has an Effect, and
Slanted lines indicate that Factor B has an Effect

The Illustrations all showed the lines for Factors A and B as being parallel. We didn't mention it in that blog post, but parallel lines indicate that the two Factors do not interact.

For example, in the left diagram, above, both detergents respond the same way to a change in water temperature -- they show no change in their Effect, cleanliness. Likewise, the middle and right diagrams show both detergents behaving the same -- a parallel increase in Effect.

But what if we got something like the two graphs below? In the example at left, Detergent #1 shows a substantial increase in effectiveness as the water temperature is increased. But for Detergent #2, heating the water has the opposite effect: its effectiveness is decreased.

In the example on the right, both detergents show an increase in effectiveness as water temperature increases. But Detergent #2's increase is fairly minor. In fact, its increase may not be Statistically Significant, and the Interaction may not be Statistically Significant either.

In either case, we do have reason to suspect an Interaction, so 2-Way ANOVA Without Replication cannot be used. We must use 2-Way ANOVA With Replication.

The With Replication method repeats (Replicates) the experiment several times for each combination of Factor A and B Values. This can provide sufficient data to quantify an Interaction.
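
To make the idea of "non-parallel lines" concrete, the sketch below compares the temperature effect for each detergent using replicated data. Everything here is assumed for illustration: the cleanliness scores are made up, there are only two temperature levels, and the "interaction contrast" is just the difference between the two slopes. It does not perform the ANOVA itself, so it says nothing about Statistical Significance.

```python
from statistics import mean

# Hypothetical cleanliness scores, 3 replicates per (detergent, temperature) cell
cells = {
    ("Detergent 1", "cold"): [60, 62, 61],
    ("Detergent 1", "hot"): [80, 79, 81],
    ("Detergent 2", "cold"): [70, 71, 69],
    ("Detergent 2", "hot"): [60, 59, 61],
}

cell_means = {cell: mean(scores) for cell, scores in cells.items()}

# Effect of heating the water, computed separately for each detergent
effect_1 = cell_means[("Detergent 1", "hot")] - cell_means[("Detergent 1", "cold")]
effect_2 = cell_means[("Detergent 2", "hot")] - cell_means[("Detergent 2", "cold")]

# Parallel lines would give a contrast of 0; here the lines cross
interaction_contrast = effect_1 - effect_2

print(effect_1, effect_2, interaction_contrast)  # 19 -10 29
```

Whether a contrast of this size reflects a real Interaction or just noise is what 2-Way ANOVA With Replication is designed to answer.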

The number of Replications required to achieve a specified level of accuracy is determined by the methods of Design of Experiments, DOE. The Design also specifies the levels of each Factor to be used in each replication, the order of replication and other specifics of the experiment. The book has a 3-part series on DOE, and eventually there may be a video series on it as well.

There is currently a video on the book's YouTube channel with more information about the subject of this post: ANOVA -- Part 4 (of 4): 2-Way (aka 2-Factor).

Statistics Tip of the Week: Alpha and Beta Errors -- Compared and Contrasted

7/26/2017

Statistics Tip of the Week: From Alpha to Critical Value to Confidence Interval

7/18/2017

In our November 17, 2016 Tip of the Week, we showed how the concept of Alpha leads to the concept of Critical Value. This Tip shows how that, in turn, leads to the Confidence Interval.

The person performing the test selects the value for Alpha, the Significance Level.
This value is plotted as an area under the Distribution curve of a Test Statistic (z in this instance). This example shows a 2-sided test in which Alpha/2 is shaded under each tail of the Distribution.
The boundary of each shaded area is the Critical Value. It is in units of the Test Statistic (z in this instance).
We then use the formula for the Test Statistic to convert the Critical Value(s) into the units of the data -- centimeters in this example.
These values define the Confidence Limit(s), which, in turn, define the Confidence Interval.
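
The steps above can be sketched in a few lines. This is an illustration only, under stated assumptions: the sample values (a Mean of 175 cm, a known Standard Deviation of 7 cm, n = 49) are hypothetical, Alpha is set to 0.05, and the Test Statistic is taken to be z = (Sample Mean - µ) / (σ / √n), rearranged to convert the Critical Values back into centimeters.

```python
import math
from statistics import NormalDist

alpha = 0.05  # Significance Level chosen by the tester

# 2-sided test: Alpha/2 in each tail, so the upper Critical Value
# sits at cumulative probability 1 - Alpha/2
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical sample, in centimeters (sigma assumed known)
sample_mean_cm = 175.0
sigma_cm = 7.0
n = 49

# Convert the Critical Value from units of z into units of the data
standard_error = sigma_cm / math.sqrt(n)
margin_cm = z_crit * standard_error

lower_limit = sample_mean_cm - margin_cm
upper_limit = sample_mean_cm + margin_cm

print(f"z = {z_crit:.2f}, CI = ({lower_limit:.2f}, {upper_limit:.2f}) cm")
# z = 1.96, CI = (173.04, 176.96) cm
```

A smaller Alpha pushes the Critical Values further out and widens the Confidence Interval.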

Statistics Tip of the Week: Sample Sizes and Margins of Error for Proportions

7/11/2017

This may be a handy table to keep around somewhere. How big a Sample Size do we need if we want to differentiate between 2 choices in a survey or election? It's more than people usually think.

Some might have in mind the guidance on when to use the t Distribution instead of the z (Standard Normal) Distribution. We're told we can use z when n, the Sample Size, is "large". And then we learn that some consider 30 to be large enough, while others say 100.

But as you can see from this table, n = 100 barely gets you into the game when you're doing a survey or poll. When n = 100, you have a 10% Margin of Error (MOE). That is, you can say that you have a Statistically Significant difference if your Proportions are wider spread than 45% and 55% for the 2 candidates.

But to get to a 2% MOE, you'd need a Sample Size of 2,400. Notice also that diminishing returns set in. To get to a 1% MOE, you'd need a sample 4 times larger than you would for 2%.
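
The figures quoted above can be reproduced with the usual Margin of Error formula for a Proportion, MOE = z * sqrt(p(1 - p) / n). The sketch below assumes a 95% Confidence Level and the worst case p = 0.5. Those assumptions are not stated in the text, but they match its numbers. The function names are hypothetical.

```python
import math
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # about 1.96, assuming 95% confidence

def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of Error for a Proportion p estimated from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def required_sample_size(moe, p=0.5, z=Z_95):
    """Smallest n whose Margin of Error is no larger than moe."""
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

print(f"{margin_of_error(100):.1%}")  # 9.8%, roughly the 10% quoted
print(required_sample_size(0.02))     # 2401
print(required_sample_size(0.01))     # 9604
```

Because n grows with 1 / MOE squared, halving the Margin of Error quadruples the required Sample Size, which is the diminishing return described above.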

Statistics Tip of the Week: "Fail to Reject the Null Hypothesis" is a triple negative -- making it a negative statement.

7/3/2017

In Hypothesis Testing, "Fail to Reject the Null Hypothesis" is one of two possible conclusions to be drawn from the test. The other is "Reject the Null Hypothesis."

We are all taught in elementary school to avoid using double negatives, like "I don't have no money." However, statistics goes beyond the double negative to an even more confusing triple negative: "Fail to Reject the Null Hypothesis." Fail, Reject, and Null are all negative words. This is like saying, "I don't not have no money."

That statement confuses a lot of people. It may help to understand better if we represent a positive statement by +1 and a negative by -1. In Hypothesis Testing, we are usually trying to determine whether there is a Statistically Significant difference, change, or effect.
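
The +1 / -1 bookkeeping can be written out as a small multiplication. This is a sketch of one way to read the idea, assuming each negative word contributes a factor of -1 and the factors are multiplied together.

```python
import math

# Assumed scoring: a negative word counts as -1
NEGATIVE = -1

# "Fail" x "Reject" x "Null": three negatives
fail_to_reject_null = math.prod([NEGATIVE, NEGATIVE, NEGATIVE])

# "Reject" x "Null": two negatives
reject_null = math.prod([NEGATIVE, NEGATIVE])

print(fail_to_reject_null)  # -1
print(reject_null)          # 1
```

Read this way, a product of -1 is a negative statement: the test did not find a Statistically Significant difference, change, or effect. A product of +1 is a positive statement: it did.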

For more on how to clarify this confusing concept, please see my video, Fail to Reject the Null Hypothesis. It has received very good reviews, like these:

Statistics Tip of the Week: Which Distribution to use when

6/12/2017

Statistics Tip of the Week: Use a Line Chart to see Trends, Effects, or Interactions

6/6/2017

In earlier blog posts, we've seen how different types of charts can be used in different ways to increase our understanding of the data and to communicate that understanding to others:

Bar Chart and Histogram: http://bit.ly/2o8qLr5
Dotplot: http://bit.ly/2oMvzxL and Boxplot: http://bit.ly/2fAq7uX
Scatterplot: http://bit.ly/2hADyO0 and http://ow.ly/Pzih309hJz5

Here's another example: a Line Chart uses lines to connect points that have adjacent values on the horizontal axis. It is often used to illustrate trends, with the horizontal axis representing time.

It is also used to graph cause-and-effect, in which the x Variable (horizontal axis) is the Factor which causes the Effect in the y Variable (vertical axis). In the chart below, an increase in the Factor Variable, water temperature, causes an increase in the Effect Variable, cleanliness. This is used in Regression analysis and in the Designed Experiments which are conducted to test a Regression Model.

The following chart combines two line charts into one. It has the same x and y Variables as the previous chart, but it adds a second Factor (x) Variable, Detergent. So, there are two lines, connecting two sets of data points. In 2-Way ANOVA, crossing lines indicate that there is an Interaction between the two Factors. In this case, an increase in temperature has the opposite effect for the two detergent types – it makes Detergent #1 do better, and it makes Detergent #2 do worse. If the lines were parallel, then there would be no Interaction.

In a similar fashion, a Line Chart can help differentiate between Observed and Expected Frequencies in a Chi-Square test for Goodness of Fit.

Reproduced by permission of John Wiley and Sons, Inc.
from the book, Statistics from A to Z -- Confusing Concepts Clarified

#Statistics Tip of the Week: Nonparametric counterparts for Parametric tests

6/1/2017

A "Statistic" is a measure of a property of a Sample, for example, the Sample Mean or Sample Standard Deviation. The corresponding term for a Population or Process is "Parameter".

The most commonly used statistical tests are "Parametric", that is, they require that one or more Parameters meet certain conditions or "assumptions". Most frequently, the assumption is that the Distribution of the Population or Process is roughly Normal. Roughly equal Variance is also a common assumption.

If these conditions are not met, the Parametric test cannot be used, and a Nonparametric test must be used instead. This table shows the Nonparametric test that can be used in place of several common Parametric tests.
