# Introduction to Online Learning

### CSC 482A/581A, Spring 2023

Lectures: Mondays and Thursdays 1pm - 2:30pm, CLE B415 (note the room change!)

Instructor: Nishant Mehta. Office hours: TBD

There is no required textbook for this course, but the following are the most related textbooks and monographs:

##### **Project** - due by 11:59pm PDT, Friday April 14th

Suppose that on each day in a sequence of days, a learning agent observes the weather predictions of $$K$$ experts (such as "30% rain" or "90% rain") and must then aggregate these predictions into its own prediction (such as the probability forecast "40% rain"). At the end of each day, the agent observes the actual weather outcome (such as "rain") and suffers an error that depends on the accuracy of its prediction. The agent's goal is that, at the end of the sequence of days, its cumulative error is not much larger than the cumulative error of the best expert. What algorithmic strategy might the agent use to decide its prediction each day? This type of question is precisely what we study in online learning.
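One possible strategy for the weather example is exponential weights: keep a weight for each expert that shrinks exponentially with that expert's cumulative error, and predict the weighted average of the experts' forecasts. The code below is a minimal sketch under stated assumptions, not the course's reference implementation. The function name `exponential_weights_forecast`, the choice of squared error as the loss, and the learning rate `eta = 0.5` are all assumptions made for illustration.

```python
import math


def exponential_weights_forecast(expert_forecasts, outcomes, eta=0.5):
    """Aggregate expert rain forecasts with exponential weights.

    expert_forecasts: list of rounds; each round is a list of K
        probabilities of rain, one per expert.
    outcomes: list of 0/1 outcomes (1 = rain), one per round.
    eta: learning rate (assumed value; 0.5 suits squared loss on [0, 1]).

    Returns (agent_forecasts, agent_loss, expert_losses).
    """
    num_experts = len(expert_forecasts[0])
    expert_losses = [0.0] * num_experts  # cumulative squared loss per expert
    agent_forecasts = []
    agent_loss = 0.0

    for forecasts, outcome in zip(expert_forecasts, outcomes):
        # Weight each expert by exp(-eta * cumulative loss). Subtracting the
        # smallest loss first leaves the normalized weights unchanged and
        # avoids underflow on long sequences.
        best = min(expert_losses)
        weights = [math.exp(-eta * (loss - best)) for loss in expert_losses]
        total = sum(weights)

        # The agent predicts the weighted average of the expert forecasts.
        prediction = sum(w * f for w, f in zip(weights, forecasts)) / total
        agent_forecasts.append(prediction)

        # The outcome is revealed; the agent and every expert suffer loss.
        agent_loss += (prediction - outcome) ** 2
        for i, f in enumerate(forecasts):
            expert_losses[i] += (f - outcome) ** 2

    return agent_forecasts, agent_loss, expert_losses


# Two hypothetical experts over three days (made-up numbers).
forecasts = [[0.3, 0.9], [0.2, 0.8], [0.4, 0.7]]
outcomes = [1, 1, 0]
preds, loss, expert_losses = exponential_weights_forecast(forecasts, outcomes)
regret = loss - min(expert_losses)
```

On the first day all experts have equal weight, so the agent's forecast is the plain average of the expert forecasts. For squared loss with forecasts and outcomes in $$[0, 1]$$, which is exp-concave, the standard analysis bounds the regret by $$\ln(K)/\eta$$ whenever $$\eta \le 1/2$$, so the value of `regret` here should not exceed $$2 \ln 2$$.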

| Date | Topics | Lecture Notes and Additional reading |
| --- | --- | --- |
| 1/9, 1/12, and 1/16 | Introduction, Examples of online learning problems; Prediction with expert advice: exponential weights with convex, exp-concave, and mixable losses | Lecture 1 |
| 1/19 and 1/23 | Online classification, Halving algorithm, Littlestone dimension, SOA, "Agnostic SOA" | Lecture 2 |
| 1/26, 1/30, and 2/2 | Online convex optimization, OGD | Lecture 3 |
| 2/6 and 2/9 | Stochastic online learning and SCO: stochastic gradient descent; online-to-batch conversion ($$\mathbb{E}$$ and WHP) | Lecture 4 |
| 2/9, 2/13, and 2/16 | OCO foundations I: Follow the Regularized Leader | Lecture 5 |
| 2/20 and 2/23 | READING BREAK (NO CLASS) | |
| 2/27 | Applications of FTRL | |
| 3/2 and 3/6 | OCO foundations II: Online Mirror Descent | Lecture 6 |
| 3/9, 3/13, and 3/16 | Game theory: No-regret learning for approximate equilibria, Optimistic Hedge; Improved regret under gradual variations | Lecture 7 |
| 3/20 and 3/23 | A local norm bound for FTRL; First-order regret for Hedge via local norm analysis; Adversarial multi-armed bandits: EXP3 | Lecture 8 |
| 3/27 | EXP3 with local norm analysis; Stochastic multi-armed bandits: UCB | |
| 3/30 | Tsallis-INF | |
| 4/3 and 4/6 | Project presentations | |