**MWF 10:30-11:50AM in TTIC 530** (some of the classes will be canceled. Please see the schedule at the bottom of the page.)

Instructor: Nati Srebro.

TA: Behnam Neyshabur.

The purpose of this course is to gain a deeper understanding of machine learning by formalizing learning mathematically, studying both statistical and computational aspects of learning, and understanding how these two aspects are inseparable. The course is intended both for students interested in using machine learning methods and that would like to understand such methods better so as to use them more effectively, as well as for students interested in the mathematical aspects of learning or that intend on rigorously studying or developing learning algorithms.

We will discuss classic results and recent advances in statistical learning theory (mostly under the agnostic PAC model), touch on computational learning theory, and also explore the relationship with stochastic optimization and online regret analysis. Our emphasis will be on concept development and on obtaining a rigorous quantitative understanding of machine learning. We will also study techniques for analyzing and proving performance guarantees for learning methods.

- Mathematical maturity, as obtain, e.g., in a rigorous analysis course.
- Discrete Math (specifically combinatorics and asymptotic notation)
- Probability Theory
- Introduction to Machine Learning
- Algorithms; Basic Complexity Theory (NP-Hardness)

- The Statistical Model (Learning Based on an IID Sample):
- The PAC (Probably Approximately Correct) and Agnostic PAC models.
- Stochastic Optimization
- Cardinality Bounds
- Description Length Bounds
- PAC-Bayes
- Compression Bounds
- The Growth Function and VC Dimension
- VC Subgraph Dimension and Fat Shattering Dimension
- Tight Characterization of Learning in terms of the VC and Fat Shattering Dimensions
- Covering Numbers
- Rademacher Averages, including Local Rademacher Analysis

- Uniform Learning and No-Free Lunch Theorems
- Online Learning, Online Optimization and Online Regret
- The Perceptron Rule and Online Gradient Descent
- Experts and the Winnow Rule
- Bregman Divergence and Online Mirror Descent
- Online to Batch Conversion

- Computational Lower Bounds:
- Computational Hardness of Proper Learning
- Cryptographic Hardness of Learning

- Additional Topics
- Stability Based Analysis
- Boosting: Weak Learning and the Margin Interpretation of Boosting.

- Required Reading: Some material will be covered through assigned reading, rather than in-class.
- Grading: Your grade will be based on the final exam and homeworks (they contribute equally to the final grade).
- Problem Sets: About 4-5 problem sets, some of which introducing additional material beyond the material covered in class. Some problem sets will also include optional extra-credit problems.
- Final Exam: The final exam will mostly consist of claims where you will be required to either affirm the claim is true (without providing a proof), or provide a counterexample to the claim.

- Shai Shalev-Shwartz and Shai Ben-David.
**Understanding Machine Learning: From Theory to Algorithms.**Cambridge University Press, 2014. - O. Bousquet, S. Boucheron, and G. Lugosi.
**"Introduction to statistical learning theory."**Advanced Lectures on Machine Learning, pp. 169-207. Springer Berlin Heidelberg, 2004. - Michael J. Kearns and Umesh Virkumar Vazirani.
**An Introduction to Computational Learning Theory.**MIT Press, 1994.

Week of | Monday | Wednesday | Friday |
---|---|---|---|

March 30th | What is Learning?* | PAC Learning and VC Theory I | No class |

April 6th | PAC Learning and VC Theory II | MDL and PAC-Bayes | No class |

April 13th | No class | Computational Complexity of Learning | Proper vs Improper Learning |

April 20th | No class | Hardness of Improper Learning Agnostic Learning |
Boosting Compression Schemes |

April 27th | No class | Real-Valued Loss | Scale-Sensitive Classes |

May 4th | No class | SVMs, L1 Regularization | No class |

May 11th | No class | L1 margin, Stability | Online Learning |

May 18th | Online Gradient Descent | No class | Strong Convexity Online Dual Averaging |

May 25th | No class | Mirror Descent Online to Batch |
No class |

June 1st | Stochastic Optimization Neural Networks |
Course Summary Nearest Neighbor Classificaiton |
No class |

June 8th | Final Exam: Tuesday June 9, 10:30am |

*The shaded slides are not required material.

Problem Set 2 (due May 1st)

Problem Set 3+4 (due June 5th)