Data Mining

CSC 503, Summer 2026

Lectures: Mondays and Thursdays 4:30pm - 5:50pm, COR A225
Instructor: Nishant Mehta
TAs: Ali Mortazavi (<firstname>themorty@gmail),
        Erika Rumbold (<firstinitial><lastname>@uvic)

Labs: Wednesdays 1:30pm - 2:20pm and Thursdays 9am - 9:50am in ELW B215

Nishant's office hours: Tuesdays, 3:30pm - 5:30pm, ECS 608

Textbooks:

Official course outline: CSC 503

**Information about the Project**


What this course is about
This course is an introduction to Data Mining/Machine Learning, a sub-field of artificial intelligence that is all about how algorithms can use experience to improve their performance on tasks. This course will introduce you to many foundational machine learning methods and give you both a theoretical grounding and ample practical experience in implementing and using these methods on real data.
The objective of this course is to give students a foundation in machine learning, including important problems like classification, regression, clustering, and dimension reduction. The emphasis will be on understanding the design of various machine learning methods, learning how to use them in practice, and learning principled ways to evaluate their performance. The (optional) labs will complement the lecture topics by offering practical experience in experimenting with machine learning methods. The assignments will revolve around implementing machine learning algorithms and analyzing their results on data, with most of the emphasis on the analysis. Assignments might also involve a theoretical component (especially for graduate students).

In the schedule below, any information about future lectures is just a rough guide and might change.

Readings are required unless marked as optional. The lectures supplement the readings, and to do well in this course (and learn machine learning) you should do the readings and attend the lectures. Optional readings are often marked that way because they are more advanced. You are always welcome to ask the instructor questions about reading material, either on the discussion forum (Ed Discussion; I'll post a signup link on Brightspace soon) or in office hours (or by email if needed).

Lectures
| Date | Topics | Lecture Slides/Notes | Reading |
|---|---|---|---|
| 5/11 | Introduction; Decision Trees I | Lecture 1: slides; Lectures 1–2: slides | (Mitchell) Chapter 1; (Murphy) Chapter 1 (optional) |
| 5/14 | Decision Trees II | | (Mitchell) Chapter 3 |
| 5/18 | Victoria Day - no face-to-face lecture! | | Random Forests chapter of ESL (optional) - reading guide |
| 5/21 | Random forests; Evaluation and Model Selection | | (Mitchell) Chapter 5 |
| 5/25 | Boosting | | Boosting book - Chapter 1 (optional) |
| 5/28 | Neural Networks I: Intro, Linear separators | | (Mitchell) Chapter 4; (Murphy) Chapter 13 (optional) - reading guide |
| 6/1 | Neural Networks II: Perceptron, Gradient descent | | |
| 6/4 | Neural Networks III: SGD, Sigmoid units; Multi-layer networks, Backprop, Reducing overfitting | | |
| 6/8 | Midterm 1 | | |
| 6/11 | SVMs I: Large margin separation, Soft-margin SVM | | SVM tutorial - reading guide; Andrew Ng's SVM lecture notes (optional) |
| 6/15 | SVMs II: Soft-margin SVM, Learning with kernels | | |
| | Non face-to-face lecture (watch before end of reading break); Dimension Reduction/Feature Transformation: PCA | Watch recorded lecture (video will be below later); PCA: video pending, slides pending | Jonathon Shlens's PCA tutorial (Sections I through V) |
| 6/18 | Probability Review; Maximum Likelihood Estimation | | Estimating Probabilities: MLE and MAP; (Murphy) Chapter 4 (optional) - reading guide |
| 6/22 | MAP Estimation | | (Mitchell) Section 6.6: MDL Principle |
| 6/25 | Naive Bayes | | Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression; (Murphy) Chapters 9 and 10 (optional) - reading guide |
| 6/29 | Logistic Regression; Learning Theory I: PAC Learning | | |
| | Reading Break | | |
| 7/6 | Learning Theory II: PAC Learning continued, Agnostic Learning, VC dimension | | (Mitchell) Chapter 7 (up to and including Section 7.4.3) |
| 7/9 | Clustering I: K-means problem | | |
| 7/13 | Clustering II: K-means problem continued, Hierarchical clustering; Instance-based Learning I: k-NN and recommender systems | | |
| 7/16 | Midterm 2 | | |
| 7/20 | Instance-based Learning II: k-NN and recommender systems | | (Mitchell) Chapter 8 (Sections 8.1 and 8.2) |
| 7/23 | Gaussian mixture models and EM | | (Murphy, 2012) Chapter 11 - reading guide |
| 7/27 | Project Presentations (in class) | | |
| 7/30 | Project Presentations (in class) | | |