Brendan O'Connor's blog post "Statistics vs. Machine Learning, fight!" was first drafted in 2008. In that draft he mainly downplayed statistics, which may reflect his machine learning background. His view was similar to [1]: machine learning contains more algorithmic modeling than statistics does (for example, the max-margin formulation of SVMs, decision trees, and so on), and it is also more practical. In October 2009, however, he abandoned that original view and concluded the opposite: "Statistics, not machine learning, is the real deal, but unfortunately suffers from bad marketing."

Machine learning                | Statistics
network, graphs                 | model
weights                         | parameters
learning                        | fitting
generalization                  | test set performance
supervised learning             | regression/classification
unsupervised learning           | density estimation, clustering
large grant = $1,000,000        | large grant = $50,000
nice place to have a meeting: Snowbird, Utah, French Alps | nice place to have a meeting: Las Vegas in August
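
To make the glossary concrete, here is a minimal sketch showing that "fitting parameters" and "learning weights" can describe the same computation. It is only an illustration, not something taken from the original post: the function names `fit_ols`, `learn_weights`, and `make_data`, the synthetic data, and the step size are all assumptions chosen for the example.

```python
import random


def make_data(n=50, seed=0):
    """Synthetic data (an assumption for this sketch): y = 1 + 2x + small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


def fit_ols(xs, ys):
    """Statistics vocabulary: fit the parameters (intercept, slope) by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def learn_weights(xs, ys, lr=0.05, epochs=5000):
    """ML vocabulary: learn the weights (bias, w) by gradient descent on mean squared error."""
    bias, w = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to bias and w.
        grad_b = 2 / n * sum((bias + w * x - y) for x, y in zip(xs, ys))
        grad_w = 2 / n * sum((bias + w * x - y) * x for x, y in zip(xs, ys))
        bias -= lr * grad_b
        w -= lr * grad_w
    return bias, w


if __name__ == "__main__":
    xs, ys = make_data()
    print("fitted parameters:", fit_ols(xs, ys))
    print("learned weights:  ", learn_weights(xs, ys))
```

On this small problem both routines should arrive at essentially the same two numbers, so the difference between the two columns is one of vocabulary rather than of the underlying model.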

Differences in research methods

  • Statistical research emphasizes formalization and derivation
  • Machine learning is more tolerant of new approaches

The dimension difference

  • Statistics emphasizes formal statistical inference for low-dimensional problems (confidence intervals, hypothesis tests, optimal estimators)
  • Machine learning emphasizes high-dimensional prediction problems

Topics each field is more concerned with:

  • Statistics: survival analysis, spatial analysis, multiple testing, minimax theory, deconvolution, semiparametric inference, bootstrapping, time series.
  • Machine learning: online learning, semi-supervised learning, manifold learning, active learning, boosting.
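
As an illustration of the low-dimensional inference that the statistics side emphasizes, the following is a minimal sketch of a percentile bootstrap confidence interval for a single parameter (here, a mean). The function name `bootstrap_ci`, the number of resamples, and the sample data are assumptions made for this example; the percentile method is only one of several bootstrap interval constructions.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat`, at nominal level 1 - alpha."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical data: 200 draws from a normal distribution with mean 5.
    sample = [rng.gauss(5.0, 1.0) for _ in range(200)]
    low, high = bootstrap_ci(sample)
    print(f"sample mean = {statistics.mean(sample):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The output is an interval for one quantity rather than a prediction score, which is the kind of result the statistics column above is concerned with.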

Terminology differences between statistics and machine learning:

Statistics      | Machine learning
Estimation      | Learning
Classifier      | Hypothesis
Data point      | Example/Instance
Regression      | Supervised learning
Classification  | Supervised learning
Covariate       | Feature
Response        | Label