TTIC 31250 - An Introduction to the Theory of Machine Learning (Spring 2020)

Avrim Blum

Mon/Wed 3:00-4:20pm, Online

Office hours: email me and we can arrange a time to talk

Course description: This course will cover some of the basic theory underlying machine learning and the process of generalizing from data. We will talk about both the PAC model for batch learning (learning over one set of data with the intent of producing a predictor that performs well on new data) and models for learning from feedback over time (online learning). We will discuss important fundamental concepts including overfitting, uniform convergence, formal notions of Occam's razor, VC-dimension, and regularization, as well as several classic algorithms including the Perceptron algorithm, SVMs, algorithms for combining expert advice, and boosting. We will also discuss limited-feedback (bandit) algorithms, connections between learning and game theory, and formal guarantees on privacy. This will be a proof-oriented course: our focus will be on proving performance guarantees for algorithms that aim to generalize from data as well as understanding what kinds of performance guarantees we can hope to prove.

Prerequisites: The main prerequisite is comfort with a proof-oriented course, having taken some algorithms class, and comfort with basic probabilistic modeling and reasoning. For example, 1000 programs are submitted to a stock-market prediction challenge, and we find that one of those programs has correctly predicted whether the market will go up or down the next week for 10 weeks in a row; should we feel confident that the program is a good one? Comfort with thinking about points and vectors in high-dimensional space is a plus.

Coursework: We will have 5 homework assignments, a small course project (a 4-5 page writeup based on exploring a theoretical idea, or trying some experiments, or reading and explaining a recent research paper), and a take-home final worth 1-2 homeworks. In addition, students will be asked to help with grading one of the assignments. The course project is due on the day of the last class, and the take-home final will be made available then and can be taken at your convenience during finals week.

Homeworks

Homework 1.
Homework 2.
Homework 3.
Homework 4.
Homework 5.
Course Project: Due June 3.
Take-home final: due 24 hours after you begin, or on June 12, whichever is earlier.

Lecture Notes and Tentative Plan

04/06: Introduction. The PAC model, Overfitting, and Occam's razor. See also Sections 5.1-5.3 of BHK.
04/08: Online Learning: Mistake-Bounds, Combining Experts, Multiplicative Weights, Regret Minimization. See also 5.5.1, 5.5.2, and 5.7 of BHK and 4.1-4.3 of this book chapter.
04/13: The Perceptron Algorithm, Margins, and intro to Kernels. See also Sections 5.5.3, 5.5.4, and 5.6 of BHK.
04/15: Support Vector Machines. See also this Tutorial on SVMs and kernel methods by Hofmann, Scholkopf, and Smola.
04/20: Uniform Convergence and VC-Dimension I. See also Section 5.9 of BHK.
04/22: Uniform Convergence and VC-Dimension II. See also Section 5.9 of BHK.
04/27: Rademacher Bounds. See also the monograph by Bousquet, Boucheron, and Lugosi.
04/29: Boosting. See also Section 5.10 of BHK.
05/04: Statistical Query Model I. Here is a recent tutorial.
05/06: Statistical Query Model II. For more recent developments, see this paper of Vitaly Feldman.
05/11: Computational Hardness Results for Learning.
05/13: Learning and Game Theory I. [end at 4:00pm]
05/18: Learning and Game Theory II: Book Chapter on Learning, Regret Minimization, and Equilibria. [start at 3:30]
05/20: Bandit algorithms.
05/27: Learning Finite-State Environments.
06/01: Class cancelled today.
06/03: Differential Privacy and Learning. [Course projects due today]

The take-home final will be made available starting 06/03.

Recommended texts/readings:

A. Blum, J. Hopcroft, and R. Kannan, Foundations of Data Science (Chapter 5). Called "BHK" above.
S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms
O. Bousquet, S. Boucheron, and G. Lugosi, Introduction to Statistical Learning Theory.