Data Mining

CSC 503/SENG 474, Spring 2025

Lectures: Tuesdays, Wednesdays, and Fridays 11:30am - 12:20pm, BWC A104
Instructor: Nishant Mehta
TAs: Ali Mortazavi (<firstname>themorty@gmail), Mohamed Mouhajir

Labs: Wednesdays and Thursdays in ELW B215

Nishant's office hours: Wednesdays and Thursdays, 4pm - 5pm

Textbooks:
(Mitchell) Tom Mitchell, Machine Learning. McGraw-Hill, 1997.
(Murphy) Kevin Murphy, Probabilistic Machine Learning: An Introduction. MIT Press, 2022.

Information about the Project


What this course is about
This course is an introduction to Data Mining/Machine Learning, a sub-field of artificial intelligence concerned with how algorithms can use experience to improve their performance on tasks. It will introduce you to many foundational machine learning methods and give you both a theoretical grounding and ample practical experience in implementing and using these methods on real data.

The objective of this course is to give students a foundation in machine learning, including important problems like classification, regression, clustering, and dimensionality reduction. The emphasis will be on understanding the design of various machine learning methods, learning how to use them in practice, and learning principled ways to evaluate their performance. The (optional) labs will complement the lecture topics by offering practical experience in experimenting with machine learning methods. The assignments will revolve around implementing machine learning algorithms and analyzing their results on data, with most of the emphasis on the analysis. Assignments might also involve a theoretical component (especially for graduate students).
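To make this concrete, here is a minimal sketch (in Python) of the kind of train-and-evaluate workflow described above: fit a decision tree classifier and estimate its accuracy with cross-validation and a held-out test set. The use of scikit-learn and the iris dataset here is an illustrative assumption, not part of the course materials; in the assignments you will generally implement the algorithms yourself.

    # Minimal sketch of a train-and-evaluate workflow.
    # NOTE: scikit-learn and the iris dataset are illustrative assumptions,
    # not part of the course materials.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    # Hold out a test set so the final accuracy estimate is unaffected
    # by choices made during model selection.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    clf = DecisionTreeClassifier(max_depth=3, random_state=0)

    # 5-fold cross-validation on the training data gives a principled
    # performance estimate before the test set is touched.
    cv_scores = cross_val_score(clf, X_train, y_train, cv=5)
    print(f"cross-validation accuracy: {cv_scores.mean():.3f}")

    # Fit on all the training data and report held-out test accuracy.
    clf.fit(X_train, y_train)
    print(f"test accuracy: {clf.score(X_test, y_test):.3f}")

Decision trees and evaluation/model selection are exactly the topics of the first few lectures in the schedule below.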

In the schedule below, any information about future lectures is just a rough guide and might change.

Readings are required unless marked as optional; in many cases, readings are optional because they are more advanced. The lectures supplement the readings, and to do well in this course (and to learn machine learning) you should do the readings and attend the lectures. You are always welcome to ask the instructor questions about the reading material, either via the discussion forum (Ed Discussion; see Brightspace for the signup link), in office hours, or by email if needed.

Lectures
Date | Topics | Lecture Slides/Notes | Reading
1/7 | Introduction | Lecture 1: slides | (Mitchell) Chapter 1; (Murphy) Chapter 1 (optional)
1/8 | Decision Trees I | Lecture 2: slides | (Mitchell) Chapter 3
1/10 | Decision Trees II
1/14 | Random Forests | Lecture 4: slides | Random Forests chapter of ESL (optional) - reading guide
1/15 | Evaluation and Model Selection | Lecture 5: slides | (Mitchell) Chapter 5
1/17 | Boosting | Lecture 6: slides | Boosting book - Chapter 1 (optional)
1/21 | Neural Networks I: Intro | Lectures 7–11: slides | (Mitchell) Chapter 4; (Murphy) Chapter 13 (optional) - reading guide
1/22 | Neural Networks II: Linear separators
1/24 | Neural Networks III: Perceptron, gradient descent, SGD
1/28 | Neural Networks IV: Sigmoid units, multi-layer networks, backprop
1/29 | Neural Networks V: Reducing overfitting
1/31 | SVMs I: Large margin separation, soft-margin SVM, learning with kernels | Lectures 12–14: slides | SVM tutorial - reading guide; Andrew Ng's SVM lecture notes (optional)
2/4 | Snow day ☃ - no class!
2/5 | SVMs II: Soft-margin SVM, learning with kernels
2/7 | Midterm
2/11 | Learning with kernels
2/12 | Probability Review; Maximum Likelihood Estimation | Lecture 15: notes (Spring 2021) | Estimating Probabilities: MLE and MAP; (Murphy) Chapter 4 (optional) - reading guide
2/14 | MAP Estimation (including MDL) | Lecture 16: slides | (Mitchell) Section 6.6: MDL Principle
Reading Break
2/25 | Naive Bayes | | Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression; (Murphy) Chapters 9 and 10 (optional) - reading guide