Machine Learning Theory

CSC 431/531, Spring 2025

Lectures: Mondays and Thursdays 1pm - 2:20pm, Cornett Building A221
Instructor: Nishant Mehta. Office hours: ECS 608, Wednesdays and Thursdays, 4pm - 5pm
TAs: Quan Nguyen (manhquan233 [at] gmail.com)
     Ali Mortazavi (<firstname>themorty [at] gmail)

Recommended textbooks:

(SSBD) Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms (cited as "SSBD" in the schedule below)

About the course
Given an i.i.d. sample of data points with accompanying labels, how well can a learning algorithm predict the labels of future data? And how well can a learning algorithm perform if it instead must predict the label of each data point in a sequence, given all previously observed labeled data points, where the sequence may be devised by an adversary rather than drawn i.i.d.? The first question lies at the heart of statistical learning theory, while the second comes from the area of online learning.

In this course we will study core techniques for analyzing the performance of machine learning algorithms on statistical learning problems, as well as principles for designing good machine learning methods. A smaller but significant component of the course will focus on various online learning settings, including prediction with expert advice and online convex optimization.
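As a small taste of the expert-advice setting mentioned above, here is a minimal sketch of the Hedge (exponential weights) algorithm; the function name, the learning-rate argument, and the loss matrix used in the example are illustrative choices, not taken from the course materials:

```python
import numpy as np

def hedge(loss_matrix, eta):
    """Run Hedge on a T x N matrix of expert losses in [0, 1].

    At each round the learner plays a distribution over the N experts
    proportional to exp(-eta * cumulative loss), then observes all
    experts' losses. Returns the learner's total expected loss and the
    (unnormalized) final expert weights.
    """
    T, N = loss_matrix.shape
    log_w = np.zeros(N)                  # log-weights, initially uniform
    total_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())  # subtract max for numerical stability
        p /= p.sum()                     # distribution over experts
        total_loss += p @ loss_matrix[t]
        log_w -= eta * loss_matrix[t]    # exponential weight update
    return total_loss, np.exp(log_w - log_w.max())
```

For instance, against one expert that always suffers loss 0 and another that always suffers loss 1, the learner's total loss stays bounded by a constant (independent of the horizon), consistent with the O(sqrt(T log N)) regret bounds studied later in the course.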

In the schedule below, any information about future lectures is just a rough guide and might change.

Lectures
Date Topics Lecture Notes Additional reading
1/6 Introduction Lecture 1
1/9 Mistake bound model, Perceptron Lecture 2
1/13 Perceptron continued, Consistency model Lecture 3 (SSBD): Section 3.1
1/16 PAC model, Connections between learning models Lecture 4
1/20 Agnostic learning I: Uniform convergence for finite classes Lecture 5 Introduction to Statistical Learning Theory (Boucheron, Bousquet, Lugosi).
(SSBD): Chapter 4
(SSBD): Chapter 5 for No-Free-Lunch theorem
1/23 Agnostic learning II: Uniform convergence for infinite classes via the Growth Function Lecture 6
1/27 Agnostic learning III: Uniform convergence for infinite classes via the VC dimension Lectures 7 & 8
1/30 Agnostic learning IV: Examples of VC classes
2/3 Compression schemes Lecture 9 (SSBD): Chapter 30
2/6 Rademacher complexity Lectures 10–12
2/10 Bounds based on Rademacher complexity
2/13 Properties of Rademacher complexity
2/24 Support vector machines, margin bounds
2/27 Littlestone dimension (SSBD): Section 21.1.1
3/3 Decision-theoretic online learning, Hedge
3/6 Prediction with expert advice, Exponentially weighted average forecaster, Exp-concave losses
3/10 Game theory (approximating Nash equilibria)
3/13 Agnostic online classification, Standard optimal algorithm