TTIC 31120: Computational and Statistical Learning Theory

This is the web page for the Fall 2016 course "Computational and Statistical Learning Theory", taught at TTIC and also open to all University of Chicago students.

Tuesdays and Thursdays 10:30-11:50AM in TTIC 530
Instructor: Nati Srebro.
TA: Jialei Wang, Email: jialei@uchicago.edu
Office hours: Fridays 1:40-3:00PM, TTIC Library

Course Description

The purpose of this course is to gain a deeper understanding of machine learning by formalizing learning mathematically, studying both statistical and computational aspects of learning, and understanding how these two aspects are inseparable. The course is intended both for students who are interested in using machine learning methods and would like to understand such methods better so as to use them more effectively, and for students who are interested in the mathematical aspects of learning or who intend to rigorously study or develop learning algorithms.

We will discuss classic results and recent advances in statistical learning theory (mostly under the agnostic PAC model), touch on computational learning theory, and also explore the relationship with stochastic optimization and online regret analysis. Our emphasis will be on concept development and on obtaining a rigorous quantitative understanding of machine learning. We will also study techniques for analyzing and proving performance guarantees for learning methods.

Prerequisites

Mathematical maturity, as obtained, e.g., in a rigorous analysis course.
Discrete Math (specifically combinatorics and asymptotic notation)
Probability Theory
Introduction to Machine Learning
Algorithms; Basic Complexity Theory (NP-Hardness)

Familiarity with Convex Optimization and Computational Complexity, and a background in Statistics, can be helpful but are not required.

Specific Topics:

We will try to touch on:

The Statistical Model (Learning Based on an IID Sample):
- The PAC (Probably Approximately Correct) and Agnostic PAC models.
- Stochastic Optimization
- Cardinality Bounds
- Description Length Bounds
- PAC-Bayes
- Compression Bounds
- The Growth Function and VC Dimension
- VC Subgraph Dimension and Fat Shattering Dimension
- Tight Characterization of Learning in terms of the VC and Fat Shattering Dimensions
- Covering Numbers
- Rademacher Averages, including Local Rademacher Analysis
Uniform Learning and No-Free-Lunch Theorems
Online Learning, Online Optimization and Online Regret
- The Perceptron Rule and Online Gradient Descent
- Experts and the Winnow Rule
- Bregman Divergence and Online Mirror Descent
- Online to Batch Conversion
Computational Lower Bounds:
- Computational Hardness of Proper Learning
- Cryptographic Hardness of Learning
Additional Topics
- Stability Based Analysis
- Boosting: Weak Learning and the Margin Interpretation of Boosting

Requirements and Grading:

Required Reading: Some material will be covered through assigned reading rather than in class.
Grading: Your grade will be based on the final exam and the homework, which contribute equally to the final grade.
- Problem Sets: About 4-5 problem sets, some of which introduce additional material beyond what is covered in class. Some problem sets will also include optional extra-credit problems.
- Final Exam: The final exam will mostly consist of claims for which you will be required either to affirm that the claim is true (without providing a proof) or to provide a counterexample to the claim.

Recommended Texts

We will not be following a specific text, but some of the material is covered in:

Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
O. Bousquet, S. Boucheron, and G. Lugosi. "Introduction to statistical learning theory." Advanced Lectures on Machine Learning, pp. 169-207. Springer Berlin Heidelberg, 2004.
Michael J. Kearns and Umesh Virkumar Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.

Schedule and Lectures

Week of	Tuesday	Thursday
September 27th	What is Learning?	PAC Learning and VC Theory I
October 3rd	PAC Learning and VC Theory II	MDL and PAC-Bayes
October 10th	Computational Complexity of Learning	Proper vs Improper Learning
October 17th	Agnostic Learning	Boosting and Compression Schemes
October 24th	Real-Valued Loss	Scale-Sensitive Classes
October 31st	SVMs, L1 Regularization, Boosting	Regularized Learning, Stability
November 7th	Online Learning	FTRL, OGD
November 14th	Online Dual Averaging, Mirror Descent	Online to Batch, Stochastic Optimization
November 21st	Optimistic Rate, KNN	No Class
November 28th	Neural Networks, Course Summary	No Class

Assignments:

Problem Set 1 (due October 10th)
Problem Set 2 (due October 24th)
Problem Set 3 (due November 7th)
Problem Set 4 (due December 2nd)

Sample Final Exam

Sample final exam