As data scientists who are not statisticians in the strictest sense, we often worry that we may commit some kind of statistical faux pas. Fear no more! With the release of the probing new book “Statistics Done Wrong” by Alex Reinhart, we now have a road map for avoiding statistical fallacies. Reinhart, a Ph.D. student and statistics instructor at Carnegie Mellon University, shows how scientific progress depends on good research, and good research needs good statistics. But statistical analysis is tricky to get right, even for the best data scientists. You’ll be surprised how many practicing data scientists are doing it wrong.
Although the book is written for a broad audience of scientific researchers, I found it compelling as someone who works daily in data science. It covers many of the concepts I use regularly, such as linear regression, overfitting, confounding variables, cross validation, feature selection, p-values, and confidence intervals.
The best part of the book is its many examples of statistical blunders in modern science. Reinhart provides ample cases of embarrassing errors and omissions in recent research. You’ll learn about the misconceptions and scientific politics that allow these mistakes to happen, and the book points you toward reforming the way you do statistics.
Here is a list of chapters:
Chapter 1 – An Introduction to Statistical Significance
Chapter 2 – Statistical Power and Underpowered Statistics
Chapter 3 – Pseudoreplication: Choose Your Data Wisely
Chapter 4 – The P Value and The Base Rate Fallacy
Chapter 5 – Bad Judges of Significance
Chapter 6 – Double-Dipping in the Data
Chapter 7 – Continuity Errors
Chapter 8 – Model Abuse
Chapter 9 – Researcher Freedom: Good Vibrations
Chapter 10 – Everybody Makes Mistakes
Chapter 11 – Hiding the Data
Chapter 12 – What Can Be Done?
I think “Statistics Done Wrong” is an important addition to any data scientist’s library. In addition, the pithy writing style will keep your interest and fuel your creativity for future projects. Highly recommended.
Daniel D. Gutierrez – Managing Editor