Correlation does not imply Causation. We can calculate a Correlation Coefficient, r, which can tell us that 2 Variables are strongly correlated. However, it doesn't tell us whether x causes y or y causes x. Or there could be a 3rd Variable which influences the behavior of both x and y. Statistics alone can't tell us which of these is the case. We need human subject matter experts to tell us what the real-world mechanisms and influencers are.

If they confirm that there are plausible mechanisms by which the observed Correlation could be a cause-and-effect relationship, then we can use the data we have collected to develop a Regression Model to describe and size this cause and effect.

Are we done? No. We have to statistically demonstrate that the Regression Model describes what is actually happening. To establish the validity of the Regression Model, we cannot re-use the data we used to develop it. We must go out and collect new data, and then test the Regression Model against that new data, using methods described in the discipline of Design of Experiments. These experiments can give strong evidence of the causation, or not.

This is very much a simplification of what goes on with several complicated and potentially confusing concepts. The book has 2 articles on Correlation, 5 articles on Regression, and 3 on Design of Experiments. Eventually, I hope to upload videos for each of these articles to my YouTube channel, "Statistics from A to Z -- Confusing Concepts Clarified". The videos page on this website will always show the latest status of the videos: which are available and which are planned next.
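To make the Correlation-versus-Causation argument above concrete, here is a small simulation sketch in Python. Everything in it is an assumption made for illustration: the data are simulated, the hidden 3rd Variable `z`, the noise levels, the sample size, and the function names `observe` and `experiment` are all hypothetical. The "experiment" is a heavily simplified stand-in for a real Design of Experiments study. It only captures the idea that we set x ourselves and then collect new data. In this made-up world, x has no effect on y at all; both are driven by z.

```python
import random

def pearson_r(xs, ys):
    """Pearson Correlation Coefficient r for two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys explained by an already-fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def observe(rng, n):
    """Observational data (hypothetical): hidden z drives both x and y."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 0.3) for zi in z]
    y = [zi + rng.gauss(0, 0.3) for zi in z]
    return x, y

def experiment(rng, n):
    """New data (hypothetical): we set x ourselves, so z cannot influence it."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [rng.gauss(0, 1) for _ in range(n)]   # x assigned independently of z
    y = [zi + rng.gauss(0, 0.3) for zi in z]  # y still depends only on z
    return x, y

rng = random.Random(42)  # fixed seed so the run is repeatable

# Step 1: the original data show a strong Correlation between x and y.
x_obs, y_obs = observe(rng, 2000)
r_obs = pearson_r(x_obs, y_obs)

# Step 2: develop a Regression Model from that same original data.
slope, intercept = fit_line(x_obs, y_obs)

# Step 3: collect NEW data and test the existing model against it.
x_new, y_new = experiment(rng, 2000)
r_new = pearson_r(x_new, y_new)
r2_new = r_squared(x_new, y_new, slope, intercept)

print(f"r in original data:          {r_obs:.2f}")
print(f"fitted model: y = {slope:.2f} * x + {intercept:.2f}")
print(f"r in new experimental data:  {r_new:.2f}")
print(f"R-squared of model on new data: {r2_new:.2f}")
```

With these settings the population Correlation between x and y in the original data is 1/1.09, or about 0.92, so r looks impressive and the fitted line looks convincing. Once x is set independently of z, the Correlation in the new data falls to around zero and the old model predicts the new y values worse than simply guessing their mean (a negative R-squared). In this simulation, then, the new data argue against x causing y. If x really did drive y, the model would hold up on the new data instead.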
Author: Andrew A. (Andy) Jawlik is the author of the book, Statistics from A to Z -- Confusing Concepts Clarified, published by Wiley.