TTIC 31120: Computational and Statistical Learning Theory

This is a webpage for the Winter 2011 course "Computational and Statistical Learning Theory", taught at TTIC, and also open to all University of Chicago students.

Lectures: Monday and Wednesday 3-4:30pm at TTIC 530 (located at 6045 S. Kenwood Ave, fifth floor).

Instructor: Nati Srebro.
TA: Karthik Sridharan.

Course Description

We will discuss classic results and recent advances in statistical learning theory (mostly under the agnostic PAC model), touch on computational learning theory, and also explore the relationship with stochastic optimization, the oracle model of optimization, and online regret analysis. Our emphasis will be on developing a rigorous quantitative understanding of machine learning and acquiring techniques for analyzing and proving performance guarantees for learning methods.

Pre-Requisites

Mathematical maturity, as obtain, e.g., in a rigorous analysis course.
Discrete Math (specifically combinatorics and asymptotic notation)
Probability Theory
Introduction to Machine Learning
Algorithms; Basic Complexity Theory (NP-Hardness)

Familiarity with Convex Optimization, Computational Complexity and background in Statistics can be helpful, but is not required.

Specific Topics:

We will try to touch

The Statistical Model (Learning Based on an IID Sample):
- The PAC (Probably Approximately Correct) and Agnostic PAC models.
- Stochastic Optimization
- Hoeffding and Bernstien Inequalities
- Cardinality Bounds
- Description Length Bounds
- PAC-Bayes
- Compression Bounds
- The Growth Function and VC Dimension
- VC Subgraph Dimension and Fat Shattering Dimension
- Tight Characterization of Learning in terms of the VC and Fat Shattering Dimensions
- Covering Numbers
- Radamacher Averages, including Local Radamacher Analysis
Uniform Learning and No-Free Lunch Theorems
Online Learning, Online Optimization and Online Regret
- The Perceptron Rule and Online Gradient Descent
- Experts and the Winnow Rule
- Bergman Divergence and Online Mirror Descent
- Online to Batch Conversion
Computational Lower Bounds:
- Computational Hardness of Proper Learning
- Cryptographic Hardness of Learning
- Lower Bounds in the Oracle Model of Optimization
Additional Topics
- Stability Based Analysis
- Boosting: Weak Learning and the Margin Interpretation of Boosting.
- Comparison of assumptions and results with typical statistical study of regression.

Requirements and Grading:

Required Reading: Some material will be covered through assigned reading, rather then in-class.
Scribing: students are required to scribe (prepare lecture notes) one lecture. This is required in order to receive a grade in the class, and the quality of the notes will be taken into consideration in determining the grade.
Exercises and Problems: every 1-2 weeks three types of problems will be posted:
- Exercises: These are relatively easy review exercises, often covering required reading not covered in class. A random subset of these will be graded.
- Challenge Problems: These are harder problems that might be a challenge to solve. Successfully completing at least 1-2 of these is required for receiving an "A" in the class. Students will also be asked to present and discuss (with the instructor) the solutions to 1-2 of these.
- Research Problems: I don't know how to solve these (though I might have an idea). Solving one of these will likely earn an "A+" in the class and might lead to a publication.
There will be a short final in-class quiz on basic concepts covered in the class.

In order to receive a "B" a student is expected to complete the exercises, do a good job at scribing a lecture and display reasonable understanding on the quiz.
In order to receive an "A" a student is expected to complete the exercises, do a very good job scribing a lecture, successfully answer at least 1-2 Challenge Problems and present them to the instructor satisfactorily.

Lectures and Required Reading:

Lecture 1: Monday January 3rd

Course Overview
Basic Setup: Generalization Error, Learning Rule, Learning Algorithm
Analysis of the "Minimum Consistent Rectangle" learning rule in the realizable case.
No Free Lunch Theorem.

Required Reading: Section 1 of: S. Boucheron, O. Bousquet, and G. Lugosi, (2004), Concentration inequalities, Advanced Lectures in Machine Learning, Springer, pp. 208--240, 2004 Alternative reading : Lecture notes
Homework 1 out (due 5th Jan, 2010 (preferable), ultimate deadline 10th Jan, 2010)

Lecture 2: Wednesday January 5th

Empirical Risk Minimization
Finite cardinality generalization bound and learning gurantee.
The growth function.

Lecture 4: Monday January 10th VC-dimension and the Shatter Lemma.

Lecture 5: Wednesday January 13th PAC Learnability and the VC dimension. Description Length based learning rules and gurantees, Structural Risk Minimization.

No Lecture on Monday January 17th (MLK Day)

Lecture 6: Wednesday January 19th, given by Karthik PAC-Bayes bound. Compression scheme bounds. Homework 2 out (due January 31st)

Lecture 7: Monday January 24th Universal learning. Computational considerations: efficient PAC learning and proper PAC learning.

Lecture 8: Wednesday January 26th Efficient proper PAC learning implies Consistency is in RP. Hardness of efficiently PAC learning 3-TERM-DNFs, even though they can be learned improperly.

Lecture 9: Monday January 31st Inherent hardness of efficient PAC learning. Existence of hypothesis classes that are easy to learn statistically but hard to learn computationally, based on counting arguments. Overview of cryptographic hardness of efficient PAC learning. Hardness of efficient PAC learning vs efficient agnostic PAC learning. Homework 3 out (due 16th Feb, 2011) Homework 4 out (due 2nd March, 2011) Homework 5 out (due 16th March, 2011)

Assignments:

Homework 1 (due 5th Jan, 2011 (preferable), ultimate deadline 10th Jan, 2010) Note: updated January 4th afternoon(solutions to earlier posted version will also be accepted)

Homework 2 (due 31st Jan, 2011)

Homework 3 (due 16th Feb, 2011)

Homework 4 (due 2nd March, 2011)

Homework 5 out (due 16th March, 2011)

Last modified: Thu Jan 27 15:09:06 Central Standard Time 2011