Friday, 6 September 2019

What is the use of statistics in science?



This is a good question that came up on my Quora feed.  I would say that essentially every present-day science uses statistical techniques as a key process in exploratory research and theory confirmation, to a greater or lesser extent (usually greater).

Sometimes these are traditional statistical methods, other times they are newer “data science” techniques such as machine learning (most of these algorithms draw upon statistical and probabilistic concepts).  The statistical modelling methods are often better for inference (understanding what is going on), while machine learning methods are often better at prediction or categorization.  That’s not a hard and fast rule, but a useful generalization, I think.

As a general statement, pretty well all lab sciences depend on statistical methods for error analysis.  The ideas behind hypothesis testing and confidence intervals are also ubiquitous in physical, medical and social sciences.
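
To make the confidence interval idea a bit more concrete, here is a minimal Python sketch.  The readings are made-up numbers (imagine repeated lab measurements of g), the function name is my own, and it uses a normal approximation, which is an assumption: with only a handful of readings a t-distribution would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(measurements, confidence=0.95):
    """Return (mean, lower, upper) for the mean, using a normal approximation."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")
    centre = mean(measurements)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    standard_error = stdev(measurements) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre, centre - z * standard_error, centre + z * standard_error


# Hypothetical repeated measurements of g, in m/s^2 (illustrative only).
readings = [9.79, 9.82, 9.81, 9.80, 9.83, 9.78, 9.81, 9.80]
centre, low, high = mean_confidence_interval(readings)
print(f"mean = {centre:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The interval shrinks as more readings are taken, which is the basic reason lab scientists repeat their measurements.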

Here are some more specific examples. 
 
·       Physics: If you take a physics degree, you are likely to come across a course called “Statistical Physics” or words to that effect.  So, that tells you something right there.  Here are a few other examples, though this is not at all an exhaustive list.
o   A lot of thermodynamics has a heavy statistical focus (e.g. temperature is explained as the average kinetic energy of a large ensemble of molecules).
o   Geophysics leans very heavily on statistical theory, especially time series analysis (e.g. I took a course called “Time Series Analysis in Geophysics”).  Very useful in exploration geophysics and earthquake analysis.
o   Astrophysics makes extensive use of statistics in the analysis of stellar spectra (e.g. power spectrum analysis helps to find planets in other solar systems).  A lot of error analysis was developed for astronomical observations in earlier centuries.
o   Particle physics makes use of statistical techniques to establish whether a new particle has actually been discovered (e.g. you hear things like “it is a 3 sigma event” when reports from particle accelerators are discussed).
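
To put the “sigma” language from the particle physics example into numbers, here is a small Python sketch that converts a number of standard deviations into a one-sided tail probability.  It assumes the fluctuations follow a normal distribution, and the function name is just my own choice.  By convention, particle physicists usually speak of “evidence” at 3 sigma and reserve “discovery” for 5 sigma.

```python
from statistics import NormalDist


def sigma_to_p_value(n_sigma):
    """One-sided probability that a standard normal value exceeds n_sigma."""
    # By symmetry, P(Z > n_sigma) equals P(Z < -n_sigma); using the lower
    # tail avoids the rounding loss of computing 1 - cdf(n_sigma).
    return NormalDist().cdf(-n_sigma)


for n_sigma in (3, 5):
    print(f"{n_sigma} sigma -> p = {sigma_to_p_value(n_sigma):.2e}")
```

Under that assumption, a 3 sigma fluctuation turns up by chance roughly once in 740 tries, while a 5 sigma one turns up roughly once in 3.5 million.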

·       Geology and Earth Sciences: These fields make use of statistics in several ways:
o   Geophysical methods as outlined above.
o   Methods for estimating ore reserves or petroleum deposits depend heavily on sampling theory.

·       Biological Sciences: Lots of statistical techniques are used here as well:
o   Population genetics makes use of many statistical methods, such as cluster analysis to make taxonomical decisions.
o   Molecular genetics (e.g. the big genome-wide association studies, or GWAS, about human evolution) uses statistical methods such as regression (whether via traditional statistical methods or machine learning methods).
o   A lot of statistical theory goes back to agricultural studies (e.g. “split-plot design” in ANOVA).

·       Computing Science:
o   The newer machine learning methods often make extensive use of statistical theory.
o   Computer network design makes use of principles from probability theory for matters such as queuing algorithms.

·       Psychology and Other Quantitative Social Sciences:
o   These disciplines make extensive use of statistical methods, such as regression, ANOVA, factor analysis and cluster analysis.
o   Demography is an important sub-discipline of sociology, which is very statistical in nature.

·       Economics:
o   Economics makes extensive use of various types of regression analysis (e.g. OLS, logit, time series) and other very complex methods.
o   Marketing is very heavily dependent on statistical analysis.  In fact, many statistical methods came about for marketing purposes (e.g. A/B studies).
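
As an illustration of the A/B idea, here is a minimal Python sketch of a two-proportion z-test, one common way of analysing such a study.  The visitor and conversion counts are invented for the example, the function name is my own, and the normal approximation it relies on is an assumption that is only reasonable for fairly large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that both versions perform equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Hypothetical campaign: version A converts 120 of 2400, version B 156 of 2400.
z, p_value = two_proportion_z_test(120, 2400, 156, 2400)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

A small p-value suggests the difference between the two versions is unlikely to be chance alone, though it says nothing about whether the difference is big enough to matter commercially.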

·       Medicine:
o   Makes very extensive use of statistical theory to determine the efficacy of new drugs and treatments.
o   Meta-analysis is widely used to do “studies of studies”.
o   Epidemiology is essentially statistical in nature (e.g. the famous study of how cholera was spread via water pumps was essentially a correlation study).
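
To give a flavour of how a “study of studies” combines results, here is a minimal Python sketch of fixed-effect, inverse-variance pooling, which is one of the simpler meta-analysis methods.  The effect sizes and standard errors are invented, the function name is my own, and the fixed-effect model assumes that all the studies are estimating the same underlying effect.

```python
from math import sqrt


def fixed_effect_pool(effects, standard_errors):
    """Return (pooled effect, pooled standard error) by inverse-variance weighting."""
    # More precise studies (smaller standard error) get larger weights.
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical effect estimates from three studies, with their standard errors.
effects = [0.30, 0.10, 0.25]
standard_errors = [0.10, 0.20, 0.10]
pooled, pooled_se = fixed_effect_pool(effects, standard_errors)
print(f"pooled effect = {pooled:.3f} (standard error {pooled_se:.3f})")
```

The pooled standard error comes out smaller than that of any single study, which is the main attraction of combining studies in the first place.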


Here are a few more light-hearted examples of the use of statistics in science, mostly from XKCD: 

Interpreting p-values for your research paper.

A bit of everyday sociology, using statistics.

Data Science overconfidence.

Earthquake prediction.



-----------------------------------------------------------------------------------------------------------------
Now that you have read about science and statistics, you might want to relax and read some science fiction.  The Witches’ Stones series would be an excellent choice.  Alternatively, you could try the short story “The Magnetic Anomaly”, an SF story which includes an example of the use of statistical methods in geophysics, namely Fourier analysis (though there are no equations).

The Witches’ Stones

Or you might prefer the Witches’ Stones trilogy (they’re psychic aliens, not actual witches), which follows the interactions of a future Earth confederation, an opposing galactic power, and the Witches of Kordea.  It features Sarah Mackenzie, another feisty young Earth woman (they’re the most interesting type, and the novelist who wrote the books is pretty feisty, too).



The Magnetic Anomaly: A Science Fiction Story

“A geophysical crew went into the Canadian north. There were some regrettable accidents among a few ex-military who had become geophysical contractors after their service in the forces. A young man and young woman went temporarily mad from the stress of seeing that. They imagined things, terrible things. But both are known to have vivid imaginations; we have childhood records to verify that. It was all very sad. That’s the official story.” 





