A Probabilistic Approach to Machine Learning

Instructor: Nati Srebro

Pre-requisites:

- An introductory class to machine learning, such as "CMSC35400 Machine Learning" or "CMSC35420/TTIC103 Statistical Methods in AI".
- A basic background in probability.

Topics may include:

- Introduction to Bayesian inference.
- Conjugate priors. The binomial, multinomial and Gaussian distributions.
- Discriminative probabilistic models and the conditional likelihood.
- Approximate inference, including Laplace approximation and variational methods.
- Sampling techniques, including Gibbs Sampling and general MCMC.
- The EM Algorithm as a variational technique.
- Topic / latent variable models.
- Nonparametric Bayesian models.
- Gaussian processes and their relationship to SVM/Kernel methods.
- The Dirichlet process: clustering and nonparametric topic models.
- Neural Networks as Inference.
- Boltzmann Machines and Deep Belief Networks.

Textbooks:

- David MacKay: Information Theory, Inference and Learning Algorithms.
- The entire book, as well as some extra material, is available online (of course, you can also purchase a hardbound physical version of the book).

- Carl Rasmussen and Christopher Williams: Gaussian Processes for Machine Learning.
- This book is also available online or for purchase.
- Radford Neal: Probabilistic Inference Using Markov Chain Monte Carlo Methods
- An excellent detailed survey of MCMC sampling techniques---available (only) online.
- Andrew Gelman, John Carlin, Hal Stern and Donald Rubin: Bayesian Data Analysis, 2nd Edition.
- Available for purchase, but unfortunately not available online.
- Michael Jordan: An Introduction to Probabilistic Graphical Models.
- This book is not yet available. Hardcopies of relevant chapters will be provided to students attending the class.

Schedule:

- Thursday October 2nd
- What is learning? Importance of prior knowledge to learning. Representing prior knowledge via a probability distribution.
- Bayesian inference and "inverse probability" calculations. The prior, likelihood, posterior and evidence.

- Tuesday October 7th
- Bayesian inference. The posterior distribution of the parameters and over future events. The MAP and Maximum Likelihood estimates of the parameters.
- The Bayesian evidence and its use in model comparison.
*Reading: MacKay Sections 2.1-2.3, Chapter 3*
- Exchangeability and de Finetti's Theorem.
- What is a Conjugate Prior?
- The Beta distribution: definition, properties, calculations.
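As an illustrative aside (not part of the official course materials), a minimal Python sketch of the Beta-Bernoulli conjugate update discussed in this lecture; the prior parameters and coin-flip counts are made-up numbers:

```python
# Beta-Bernoulli conjugacy: a Beta(a, b) prior on a coin's bias, combined
# with h observed heads and t tails, gives a Beta(a + h, b + t) posterior.
def beta_posterior(a, b, heads, tails):
    return a + heads, b + tails

def beta_mean(a, b):
    return a / (a + b)

# Uniform Beta(1, 1) prior, then observe 7 heads and 3 tails (made-up data).
a, b = beta_posterior(1.0, 1.0, heads=7, tails=3)
print(beta_mean(a, b))  # posterior mean = 8 / 12
```

The posterior mean 8/12 sits between the prior mean 1/2 and the empirical frequency 7/10, illustrating how the prior acts as pseudo-counts.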

- Thursday October 9th: no lecture
- Friday October 10th:
- Supervised learning using the Naive Bayes model.
- The Posterior Mean parameter estimate and its relationship to the full posterior in the Naive Bayes model.
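A toy sketch (my own illustration, with made-up data) of the posterior-mean parameter estimate in a Naive Bayes model: a Beta(1, 1) prior on each per-class feature probability makes the posterior mean `(count + 1) / (n + 2)`, i.e. Laplace smoothing:

```python
import math

# Naive Bayes with binary features; each per-class feature probability gets
# a Beta(1, 1) prior, so its posterior-mean estimate is (count + 1) / (n + 2).
def train(X, y):
    model = {}
    for c in set(y):
        rows = [x for x, yc in zip(X, y) if yc == c]
        n = len(rows)
        probs = [(sum(r[j] for r in rows) + 1) / (n + 2)
                 for j in range(len(X[0]))]
        model[c] = (n / len(X), probs)  # class prior, feature probabilities
    return model

def predict(model, x):
    def score(c):  # log prior + log likelihood under the independence assumption
        prior, probs = model[c]
        return math.log(prior) + sum(
            math.log(pj if xj else 1 - pj) for xj, pj in zip(x, probs))
    return max(model, key=score)

# Tiny made-up dataset: two classes, three binary features.
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
y = ["a", "a", "b", "b"]
model = train(X, y)
```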

- Tuesday October 14th: no lecture.
- Thursday October 16th
- Model selection in the Naive Bayes model.
- The relationship between the Naive Bayes model and Logistic Linear Regression.
- A Probabilistic Model for Logistic Linear Regression.
- Generative vs. Discriminative Learning.

- Friday October 17th
- Conditional Independence and Factorization in Directed Graphical Models.
- The "Bayes Ball" algorithm.
*Reading: Jordan Section 2.1*
**Problem Set 1 out.**

- Tuesday October 21st
- From Linear Prediction to Gaussian Processes.
- Gaussian Processes for Regression.
*Reading: Rasmussen and Williams Sections 2.1-2.2.*
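A minimal NumPy sketch of GP regression as covered in Rasmussen and Williams Sections 2.1-2.2 (my own illustration; the kernel hyperparameters, noise level, and training data are arbitrary choices):

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0, sf=1.0):
    # Squared-exponential covariance k(x, x') = sf^2 exp(-(x - x')^2 / (2 ell^2)).
    d2 = (A[:, None] - B[None, :]) ** 2
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_posterior(X, y, Xs, noise=0.1):
    # Posterior mean and covariance of a zero-mean GP at test inputs Xs,
    # given noisy observations y at training inputs X.
    K = rbf_kernel(X, X) + noise**2 * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    Kss = rbf_kernel(Xs, Xs)
    mean = Ks.T @ np.linalg.solve(K, y)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, cov

# Illustrative data: noisy-free samples of sin(x) at five points.
X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.sin(X)
mean, cov = gp_posterior(X, y, np.array([0.5]))
```

The posterior variance `cov` shrinks near training inputs and reverts to the prior far from them, which is the Bayesian counterpart of the kernel-regression view.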

- Thursday October 23rd
**Problem set 1 recommended submission date**
- Gaussian Processes for Classification.
*Reading: Rasmussen and Williams Sections 3.1-3.3*

- Tuesday October 28th
**Final date problem set 1 accepted**
- Laplace Approximation.
*Reading: Rasmussen and Williams Section 3.4*
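A one-dimensional sketch of the Laplace approximation (my own illustration, not from the readings): fit a Gaussian N(mode, -1/logp''(mode)) to a log-posterior. The example target is the Beta(8, 4) posterior from a coin-flip model, whose gradient and curvature are known in closed form:

```python
# Laplace approximation in 1-D: find the posterior mode by gradient ascent,
# then take the Gaussian whose variance is minus the inverse curvature there.
def laplace_1d(grad, hess, x0, steps=100, lr=0.01):
    x = x0
    for _ in range(steps):
        x += lr * grad(x)          # climb to the mode
    return x, -1.0 / hess(x)       # mode, approximate posterior variance

# Unnormalized log-posterior log p(t) = 7 log t + 3 log(1 - t), i.e. Beta(8, 4).
grad = lambda t: 7 / t - 3 / (1 - t)
hess = lambda t: -7 / t**2 - 3 / (1 - t)**2

mode, var = laplace_1d(grad, hess, x0=0.5)  # mode is 7/10 analytically
```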

- Thursday October 30th
- Model Selection and the Laplace Approximation to the Bayesian Evidence.
- Relationship to Regularization and to SVMs.
*Reading: Rasmussen and Williams Sections 5.1-5.2,5.4-5.5,6.4*

- Tuesday November 4th
- Introduction to sampling techniques:
- Integration through sampling.
- Rejection sampling.
- Importance sampling.
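A hedged sketch of rejection sampling (an illustration of mine, not course code): draw from a Beta(2, 2) target using a Uniform(0, 1) proposal. The target density 6x(1-x) is bounded by M = 1.5, so a proposal is accepted when a uniform vertical coordinate falls under the density curve:

```python
import random

def beta22_density(x):
    # Beta(2, 2) density p(x) = 6 x (1 - x) on [0, 1]; its maximum is 1.5.
    return 6.0 * x * (1.0 - x)

def rejection_sample(n, M=1.5):
    samples = []
    while len(samples) < n:
        x = random.random()           # proposal ~ Uniform(0, 1)
        u = random.uniform(0.0, M)    # uniform height under the envelope M
        if u <= beta22_density(x):    # accept if under the density curve
            samples.append(x)
    return samples

random.seed(0)
xs = rejection_sample(10_000)
# The sample mean should be close to the Beta(2, 2) mean of 0.5.
```

The acceptance rate here is 1/M = 2/3; for peaked or high-dimensional targets the rate collapses, which is the motivation for MCMC below.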

- Introduction to Markov Chains and the MCMC technique:
- Sampling using a Markov chain.
- The stationary distribution.
- Ergodic and non-ergodic Markov chains.
- Detailed balance and reversible Markov chains.

*Reading: MacKay Sections 29.1-29.3, 29.6*
*More detailed reading: Neal Chapter 3*

- Thursday November 6th
- Introduction to sampling techniques (continued):
- The Metropolis MCMC.
- The Metropolis-Hastings MCMC.
- The Langevin Method.
- Gibbs Sampling.
*Reading: MacKay Sections 29.4-29.5, 41.4*
*More detailed reading: Neal Chapter 4*
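A minimal Metropolis sampler sketch (my own illustration; the target, a standard normal with log p(x) = -x²/2 + const, and the step size are arbitrary choices). With a symmetric random-walk proposal the Hastings correction vanishes and the acceptance probability is just the density ratio:

```python
import math
import random

def metropolis(logp, x0, n, step=1.0):
    # Random-walk Metropolis: propose x' ~ N(x, step^2) and accept with
    # probability min(1, p(x') / p(x)); the chain's stationary distribution
    # is proportional to exp(logp).
    x, chain = x0, []
    for _ in range(n):
        prop = x + random.gauss(0.0, step)
        if math.log(random.random()) < logp(prop) - logp(x):
            x = prop               # accept; otherwise keep the current state
        chain.append(x)
    return chain

random.seed(0)
chain = metropolis(lambda x: -0.5 * x * x, x0=0.0, n=50_000)
# Chain averages should approach the target's mean 0 and variance 1.
```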

- Tuesday November 11th
- Calculating the Bayesian Evidence using MCMC.
- MCMC Sampling vs. Optimization.
- Simulated Annealing.
*Reading: Neal Sections 6.1-6.2*
- Multi-layered Feed-forward Networks.
- Neural Networks as Inference.
*Reading: MacKay Chapters 41, 44 (and background from Chapters 38-39)*

- Thursday November 13th
- Boltzmann Machines.
*Reading: MacKay Chapters 42-43*

- Tuesday November 18th
- Restricted Boltzmann Machines and Deep Belief Networks.

- Thursday November 20th
- Latent Variables.
- Clustering Models.
- The EM Algorithm.
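A sketch of the EM algorithm for the simplest clustering model (my own illustration: a two-component 1-D Gaussian mixture with known unit variances and equal weights, so only the means are estimated; the data and initialization are made up):

```python
import math
import random

def em_two_means(xs, mu, iters=50):
    # EM for a 1-D mixture 0.5 N(mu1, 1) + 0.5 N(mu2, 1), updating only the means.
    mu1, mu2 = mu
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point.
        r = []
        for x in xs:
            p1 = math.exp(-0.5 * (x - mu1) ** 2)
            p2 = math.exp(-0.5 * (x - mu2) ** 2)
            r.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted means of the data.
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / sum(r)
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / (len(xs) - sum(r))
    return mu1, mu2

random.seed(0)
xs = ([random.gauss(-2.0, 1.0) for _ in range(200)] +
      [random.gauss(2.0, 1.0) for _ in range(200)])
mu1, mu2 = em_two_means(xs, mu=(-1.0, 1.0))
```

Each iteration lower-bounds the log-likelihood via the responsibilities and then maximizes that bound, which is the variational view of EM covered later.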

- Tuesday November 25th
- The Dirichlet Distribution
- Gibbs Sampling in Clustering Models.
**Problem Set 2 out.**

Last modified: Tue Nov 25 22:12:43 Central Standard Time 2008