In the second semester of 2020-2021, I am teaching the new class Foundations of Statistics and Machine Learning (detailed course information is available via Brightspace; for interested non-Leiden students, I provide a rough course outline below).

In the second semester of 2019-2020, I taught the class on information-theoretic learning at Leiden University (course webpage).

In the first semester of 2019-2020, I taught, together with group members Wouter Koolen and Rianne de Heide, a Mastermath course on machine learning theory. Mastermath is the Dutch national master's programme in mathematics, run jointly by the Dutch universities.

Outline Foundations of Statistics and Machine Learning Class, Leiden 2021

The motivation for preparing this course is twofold: the very significant problems with the standard statistics of the 20th century (testing, estimation with confidence intervals), and the fact that some of the ideas underlying standard statistics have important but largely unknown repercussions for machine learning.

Yet the real topic of this course, which allows us to organize the discussion in a coherent way, is the concept of likelihood, which plays an essential but different role in each of the (at least four!) classical approaches to statistics.

Disclaimer: the order and contents of the lectures may change at any time for any reason.

Lecture 1, February 3rd 2021: The Likelihood and (the Problems with) the p-value.

  • The Basic Neyman-Pearson Theory
    • Neyman-Pearson Lemma: the likelihood ratio test is optimal in the Neyman-Pearson sense for point vs. point hypotheses
  • Fisher’s testing with p-values
    • The p-value based on standard test statistics is (for simple enough models) a monotone function of the likelihood
  • The Neyman-Pearson-Fisher problem with optional stopping
  • Many other problems with p-values!
  • Short Overview of the Whole Course (illustrated with point/point hypotheses)
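
To illustrate the optional-stopping problem mentioned above, here is a small simulation of my own (not taken from the course materials): data are generated under the null hypothesis, but a z-test p-value is recomputed after every observation, and the experimenter stops as soon as it drops below the nominal level. The resulting Type I error far exceeds 0.05.

```python
# Toy simulation (my own, not from the course): optional stopping inflates
# the Type I error of a fixed-level z-test.
import math
import random

random.seed(1)

def optional_stopping_rejection_rate(n_max=100, alpha=0.05, trials=2000):
    """Fraction of H0-true runs in which the running z-test ever rejects."""
    rejections = 0
    for _ in range(trials):
        s = 0.0
        for n in range(1, n_max + 1):
            s += random.gauss(0.0, 1.0)          # data generated under H0: N(0,1)
            z = abs(s) / math.sqrt(n)            # z-statistic after n observations
            p = math.erfc(z / math.sqrt(2))      # two-sided p-value
            if p < alpha:                        # stop and "reject" as soon as p < alpha
                rejections += 1
                break
    return rejections / trials

rate = optional_stopping_rejection_rate()
print(rate)  # well above the nominal level of 0.05
```

A fixed-sample-size test would reject in about 5% of these runs; peeking after every observation pushes the rejection rate up several-fold.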

Lecture 2 (Feb. 10th) and 3 (Feb 17th): likelihood, Wald, and Bayes

  • Wald’s sequential probability ratio test (Neyman-Pearson with a stopping time determined by the likelihood ratio)
  • Wald’s fundamental result about Type I and Type II errors
  • General Recap of Basic Probability Theory and Statistics: Maximum Likelihood, Conditioning, Bayes’ Theorem, Chain Rule
  • Bayesian testing
  • The Bayes factor
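
A minimal sketch of Wald's sequential probability ratio test, under assumptions of my own choosing (two simple Gaussian hypotheses; this is not code from the course): bet on the likelihood ratio each round and stop once it crosses Wald's thresholds A = (1-β)/α or B = β/(1-α). Wald's result then bounds the Type I error (approximately) by α.

```python
# Toy SPRT sketch (my own example): H0: N(0,1) vs H1: N(1,1).
import math
import random

random.seed(2)

def sprt(alpha=0.05, beta=0.05, mu1=1.0, true_mu=0.0):
    """Run one SPRT; return True if H1 is accepted (a Type I error when true_mu=0)."""
    A = (1 - beta) / alpha        # upper threshold: accept H1
    B = beta / (1 - alpha)        # lower threshold: accept H0
    log_lr = 0.0
    while B < math.exp(log_lr) < A:
        x = random.gauss(true_mu, 1.0)
        # log-likelihood-ratio increment for N(mu1, 1) vs N(0, 1)
        log_lr += mu1 * x - mu1 ** 2 / 2
    return math.exp(log_lr) >= A

trials = 2000
type1 = sum(sprt() for _ in range(trials)) / trials
print(type1)  # close to, and in practice below, alpha = 0.05
```

Note that the sample size is random here: the test stops as soon as the accumulated likelihood ratio is decisive in either direction.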

Lecture 4 (Feb. 24) and 5 (March 3): The Likelihood Principle, the Stopping Rule Principle, Admissible Statistical Procedures (literature: handout Berger & Berry, Ferguson)

Lecture 6, 7 and 8 (March 10, 17 and 31) (no lecture on March 24th!): Minimum Description Length Learning (literature: MDL Revisited (G. and Roos), MDL book Chapters 3, 4, 6): likelihood and data compression

  • minus log-likelihood as codelength
  • the fundamental role of Jeffreys’ prior
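
The "minus log-likelihood as codelength" correspondence can be checked concretely. In this toy example of my own (not from the MDL literature cited above), we assign each binary sequence the idealized codelength -log2 P(x) under a Bernoulli model and verify the Kraft inequality, which guarantees that these lengths are achievable by a prefix code.

```python
# Toy illustration (my own): minus log2-likelihood as an idealized codelength,
# verified via the Kraft inequality.
import math
from itertools import product

theta = 0.7  # Bernoulli parameter (arbitrary choice for illustration)

def codelength(xs, p=theta):
    """Idealized codelength in bits: minus log2 of the Bernoulli likelihood."""
    loglik = sum(math.log2(p) if x == 1 else math.log2(1 - p) for x in xs)
    return -loglik

# Kraft inequality: sum_x 2^(-L(x)) = sum_x P(x) = 1 over all length-n
# sequences, so the codelengths are valid prefix-code lengths.
n = 5
kraft_sum = sum(2 ** -codelength(xs) for xs in product([0, 1], repeat=n))
print(round(kraft_sum, 6))  # 1.0
```

Since the Kraft sum equals one exactly, the code is complete: no codelength can be shortened without lengthening another.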

Lecture 9, 10 (April 7, 14): Kelly Gambling (literature: G. Shafer, Testing by Betting) – likelihood as money

  • likelihood ratio as capital growth factor
  • E-Variables and Test Martingales – the best of all worlds?
  • Always-Valid Confidence Sequences
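
The "likelihood as money" idea can be previewed with a small simulation of my own devising (the hypotheses and parameters are my choices, not the course's): under H0, the running likelihood ratio is a gambler's capital with expected value 1 at every time, i.e. a test martingale, and by Ville's inequality the chance that it ever reaches 1/α is at most α.

```python
# Toy simulation (my own): the likelihood ratio process as gambling capital.
# Under H0 it is a test martingale; Ville's inequality bounds the probability
# that it ever exceeds 1/alpha by alpha.
import math
import random

random.seed(3)

def ever_exceeds(threshold, n_max=200, mu1=0.5):
    """Under H0: N(0,1), does the LR capital for H1: N(mu1,1) ever reach threshold?"""
    log_capital = 0.0
    for _ in range(n_max):
        x = random.gauss(0.0, 1.0)             # data from H0
        log_capital += mu1 * x - mu1 ** 2 / 2  # reinvest: multiply capital by the LR
        if math.exp(log_capital) >= threshold:
            return True
    return False

alpha = 0.05
trials = 2000
crossing_rate = sum(ever_exceeds(1 / alpha) for _ in range(trials)) / trials
print(crossing_rate)  # at most about alpha = 0.05, by Ville's inequality
```

Unlike the p-value simulation of Lecture 1, this guarantee is anytime-valid: monitoring the capital continuously and stopping whenever one likes does not inflate the error probability.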

Lecture 11: Link with Machine Learning I (April 21): Generalized Bayes

Lecture 12: Link with Machine Learning II (April 28): PAC-Bayesian Theorems for General Loss Functions

Lecture 13: Link with Machine Learning III (May 12) (no lecture on May 5!): Bandits

Lecture 14: Overview and Wrap-Up (May 19).

  • Possibly some additional topics, such as likelihood ratios in forensics