Introduction to Online Learning

CSC 482A/581A, Spring 2023

Lectures: Mondays and Thursdays 1pm - 2:30pm, CLE B415 (note the room change!)
Instructor: Nishant Mehta. Office hours: TBD

There is no required textbook for this course, but the following are the most related textbooks and monographs:

    **Project** - due by 11:59pm PDT, Friday April 14th**

About the course

This course is an introduction to the foundations of online learning, also known as sequential prediction. Online learning, together with its sibling online convex optimization, has mostly developed in the last 25 years, with rapid progress over the last 10 years. The area builds upon fundamental algorithmic ideas and mathematical foundations dating as far back to the 1950's. What is online learning? Well, let's see an example:
    Suppose that for each day in a sequence of days, a learning agent observes the weather predictions of \(K\) experts (like "30% rain" or "90% rain", etc.) and then must aggregate these predictions somehow to give its own prediction (like the probability forecast "40% rain"). At the end of each day, the agent observes the actual weather outcome (like "rain"), and depending on the accuracy of its prediction, it suffers some error. The learning agent wishes that, at the end of the sequence of days, its cumulative error is not much larger than the cumulative error of the best expert. What algorithmic strategy might the agent use to decide its prediction each day? This type of question is precisely what we study in online learning.

Along the way to learning about foundational algorithms in online learning and online convex optimization, we will learn fundamental concepts from convex analysis and probability. We will begin by seeing algorithms in the area, and then we will see how they are derived as well as how to prove guarantees on their performance. We also will see applications of online learning to other areas, such as (standard) convex optimization, finding approximate Nash equilibria for two-player zero-sum games, strategies for rapidly growing wealth (portfolio selection), and (maybe!) strategies for data compression.

In the schedule below, any information about future lectures is just a rough guide and might change.

Date Topics Lecture Notes and Additional reading
1/9, 1/12, and 1/16 Introduction, Examples of online learning problems
Prediction with expert advice: exponential weights with convex, exp-concave, and mixable losses
Lecture 1
1/19 and 1/23 Online classification, Halving algorithm, Littlestone dimension, SOA, "Agnostic SOA" Lecture 2
1/26, 1/30, and 2/2 Online convex optimization, OGD Lecture 3
2/6 and 2/9 Stochastic online learning and SCO: stochastic gradient descent; online-to-batch conversion (\(\mathbb{E}\) and WHP) Lecture 4
2/9, 2/13, and 2/16 OCO foundations I: Follow the Regularized Leader Lecture 5
2/20 and 2/23 READING BREAK (NO CLASS)
2/27 Applications of FTRL
3/2 and 3/6 OCO foundations II: Online Mirror Descent Lecture 6
3/9, 3/13, and 3/16 Game theory: No-regret learning for approximate equilibria, Optimistic Hedge
Improved regret under gradual variations
Lecture 7
3/20 and 3/23 A local norm bound for FTRL
First-order regret for Hedge via local norm analysis
Adversarial multi-armed bandits: EXP3
Lecture 8
3/27 EXP3 with local norm analysis
Stochastic multi-armed bandits: UCB
3/30 Tsallis-INF
4/3 and 4/6 Project presentations