Machine Learning Theory

CSC 431/531, Spring 2025

Lectures: Mondays and Thursdays, 1pm - 2:20pm, Cornett Building A221
Instructor: Nishant Mehta. Office hours: ECS 608, Wednesdays and Thursdays, 4pm - 5pm
TAs: Quan Nguyen (manhquan233 [at] gmail.com)
        Ali Mortazavi (<firstname>themorty [at] gmail.com)

Recommended textbooks:

(SSBD) Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014. (Cited as "(SSBD)" in the schedule below.)

Graduate student project


About the course
Given an i.i.d. sample of data points with accompanying labels, how well can a learning algorithm predict the labels of future data? And how well can a learning algorithm perform if it instead predicts the label of each data point given all previously observed labeled data points, where the sequence of data points might even be devised by an adversary rather than being an i.i.d. sample? The first question lies at the heart of statistical learning theory, while the second comes from the area of online learning.

In this course we will study core techniques for analyzing the performance of machine learning algorithms on statistical learning problems, as well as principles for designing good machine learning methods. A smaller but significant component of this course will focus on various online learning settings, including prediction with expert advice and online convex optimization.
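As a taste of the online learning component, here is a minimal sketch of the exponentially weighted average forecaster for prediction with expert advice. It is illustrative only, not taken from the course materials; the number of experts, horizon, synthetic losses, and the tuned learning rate are all assumptions.

```python
import numpy as np

# Minimal sketch of the exponentially weighted average (EWA) forecaster
# for prediction with expert advice. All specifics below (K experts,
# horizon T, uniform synthetic losses, the tuned learning rate) are
# illustrative assumptions.

rng = np.random.default_rng(0)
K, T = 10, 1000                    # number of experts, number of rounds
eta = np.sqrt(8 * np.log(K) / T)   # learning rate tuned for losses in [0, 1]

cum_loss = np.zeros(K)             # cumulative loss of each expert
learner_loss = 0.0                 # cumulative (expected) loss of the learner

for t in range(T):
    # Weight each expert exponentially by its past cumulative loss.
    w = np.exp(-eta * cum_loss)
    p = w / w.sum()

    # The environment (possibly adversarial; here synthetic) reveals
    # this round's loss for each expert.
    loss = rng.uniform(0.0, 1.0, size=K)

    learner_loss += p @ loss       # learner pays the weighted average loss
    cum_loss += loss

regret = learner_loss - cum_loss.min()
print(f"regret = {regret:.2f}  (guarantee sqrt(T ln(K)/2) = {np.sqrt(T * np.log(K) / 2):.2f})")
```

For losses in [0, 1], this choice of learning rate guarantees regret at most sqrt(T ln(K)/2) against the best expert in hindsight, no matter how the losses are generated.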

In the schedule below, any information about future lectures is just a rough guide and might change.

Lectures
| Date | Topics | Lecture Notes | Additional reading |
|------|--------|---------------|--------------------|
| 1/6  | Introduction | Lecture 1 | |
| 1/9  | Mistake bound model, Perceptron | Lecture 2 | |
| 1/13 | Perceptron continued, Consistency model | Lecture 3 | (SSBD): Section 3.1 |
| 1/16 | PAC model, Connections between learning models | Lecture 4 | |
| 1/20 | Agnostic learning I: Uniform convergence for finite classes | Lecture 5 | Intro to statistical learning theory (Boucheron, Bousquet, Lugosi); (SSBD): Chapter 4; (SSBD): Chapter 5 for the No-Free-Lunch theorem |
| 1/23 | Agnostic learning II: Uniform convergence for infinite classes via the Growth Function | Lecture 6 | |
| 1/27 | Agnostic learning III: Uniform convergence for infinite classes via the VC dimension | Lectures 7 & 8 | |
| 1/30 | Agnostic learning IV: Examples of VC classes | | |
| 2/3  | Compression schemes | Lecture 9 | (SSBD): Chapter 30 |
| 2/6  | Rademacher complexity | Lectures 10–13 | |
| 2/10 | Bounds based on Rademacher complexity | | |
| 2/13 | Properties of Rademacher complexity | | |
| 2/24 | Properties of Rademacher complexity continued | | |
| 2/27 | Support vector machines, margin bounds | Lecture 14 | |
| 3/3  | Prediction with expert advice, Exponentially weighted average forecaster | Lectures 15 & 16 (+ 17) | |
| 3/6  | Mixability, Exp-concave losses | | |
| 3/10 | Decision-theoretic online learning; Hedge | | |
| 3/13 | Game theory (approximating Nash equilibria) | Lecture 18 (+ 19) | |
| 3/17 | Online Convex Optimization, Online Gradient Descent | Lecture 19 (+ 20) | |
| 3/20 | Online Gradient Descent continued; Online-to-batch conversion | Lecture 20 | |
| 3/24 | Non-stochastic multi-armed bandits, EXP3 | Lectures 21 & 22 | |
| 3/27 | Analysis of EXP3 | | |
| 3/31 | Project presentations | | |
| 4/3  | Project presentations | | |