Machine Learning Theory (CS 6783)


News :

  1. Homework 1 is out, due Sep 16th!
  2. Course added to Piazza, please join.
  3. Welcome to the first day of class!
Location and Time :


Location : Hollister Hall, 306
Time : Tue-Thu 1:25 PM to 2:40 PM
Office Hours : Fridays, 2 PM to 3 PM


Description :


We will discuss classical results and recent advances in both statistical (iid batch) and online learning theory. We will also touch upon results in computational learning theory. The course aims to provide students with tools and techniques to understand the inherent mathematical complexity of learning problems, to analyze and prove performance guarantees for machine learning methods, and to develop theoretically sound learning algorithms.

Pre-requisite :

Students require a basic level of mathematical maturity and comfort with theorem-and-proof style material. Familiarity with probability theory, basic algorithms, and an introductory course on machine learning (CS 4780 or equivalent) are required. M.Eng. and undergraduate students require permission of the instructor.

Tentative topics :

    1. Introduction
    Overview of the learning problem : statistical and online learning frameworks.
    PAC and agnostic PAC models, the online learning protocol (adaptive/oblivious adversaries, various notions of regret), connections to optimization and statistical estimation

    2. Minimax formulation of learning, distribution-free and adversarial learning settings, uniform guarantees, and no-free-lunch theorems

    3. Statistical Learning Framework
    . Empirical risk minimization and regularized empirical risk minimization
    . Uniform convergence (iid data)
    . Finite classes, PAC-Bayes theorem, compression bounds
    . VC dimension and growth function
    . Rademacher complexity, covering numbers, Dudley integral bounds, fat-shattering dimension (a small Rademacher complexity sketch appears after this topic list)
    . Supervised learnability : necessary and sufficient conditions via uniform convergence (iid)
    . Local Rademacher analysis and fast rates

    4. Online Learning Framework (sequential prediction/decision making)
    . Learning with expert advice, Perceptron, Winnow
    . Sequential minimax analysis for online learning
    . Uniform convergence over martingales
    . Sequential Rademacher complexity, sequential covering numbers, sequential fat-shattering dimension
    . Supervised online learnability : necessary and sufficient conditions via martingale uniform convergence

    5. Deriving algorithms through relaxation and minimax analysis for online learning

    6. Additional topics
    . Connections between learning and approximation algorithms
    . Connections between learning and optimization
    . Algorithmic stability tools
    . Statistical estimation vs. statistical learning vs. stochastic optimization
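
To make the Rademacher complexity item in topic 3 concrete, here is a minimal Monte Carlo sketch of estimating the empirical Rademacher complexity of a finite class. It is not part of the course material; the function name, the threshold-classifier example, and all parameter choices are illustrative assumptions.

```python
import numpy as np

def empirical_rademacher(values, num_draws=1000, rng=None):
    """Monte Carlo estimate of the empirical Rademacher complexity of a
    finite class, given its value matrix on the sample.

    values : array of shape (num_hypotheses, n); row h holds hypothesis h's
             values (e.g. predictions or losses) on the n sample points.
    Returns an estimate of E_sigma[ sup_h (1/n) sum_i sigma_i * values[h, i] ].
    """
    rng = np.random.default_rng() if rng is None else rng
    num_h, n = values.shape
    total = 0.0
    for _ in range(num_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)  # i.i.d. Rademacher signs
        total += np.max(values @ sigma) / n      # sup over the finite class
    return total / num_draws

# Illustrative example: 50 threshold classifiers on 20 random points in [0, 1].
rng = np.random.default_rng(0)
x = rng.random(20)
thresholds = np.linspace(0.0, 1.0, 50)
preds = np.where(x[None, :] <= thresholds[:, None], 1.0, -1.0)
# Massart's finite lemma bounds the answer by sqrt(2 * log(50) / 20) ~ 0.62.
print(empirical_rademacher(preds, rng=rng))
```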

Assignments : There will be a total of 5 assignments, worth 55% of your grade.

Term project :
    There will be a small term project due by the end of the course. The project is worth 40% of your grade. The project could be one of :
    1. A research problem of your choice, approved by me, for which you will submit a report by the end of the term, or
    2. Completing problems worth up to 10 stars from a list of problems that will be provided soon (the number of stars for a given problem depends on its difficulty).
The project proposal/approval of the project is due on Oct 15th.
Lectures :

  1. Lecture 1 : Introduction, course details, what is learning theory, learning frameworks [slides]
    Reference : [1] (ch 1 and 3)

  2. Lecture 2 : Minimax Rates, comparing the different frameworks [lec2]

  3. Lecture 3 : No Free Lunch, Statistical Learning [lec3]

  4. Lecture 4 : Statistical Learning: ERM, finite classes, MDL [lec4]

  5. Lecture 5 : Statistical Learning: MDL continued, infinite classes [lec5]

  6. Lecture 6 : Statistical Learning: Symmetrization, Rademacher Complexity, Growth function [lec6]

  7. Lecture 7 : Statistical Learning: Growth function, VC Dimension, Massart's finite lemma [lec7]

  8. Lecture 8 : Statistical Learning: VC dimension, Sauer-Shelah lemma, continued [lec7]

  9. Lecture 9 : Statistical Learning: Properties of Rademacher complexity, examples [lec9]

  10. Lecture 10 : Statistical Learning: Examples continued, Covering numbers, Pollard bound [lec10]

  11. Lecture 11 : Statistical Learning: Covering numbers, Pollard bound, Dudley Bound [lec11]

  12. Lecture 12 : Statistical Learning: Covering numbers, fat-shattering dimension, learnability [lec12]

  13. Lecture 13 : Online Learning: Halving, Exponential weights, minimax rate for bit prediction [lec13] (see the exponential weights sketch after this lecture list)

  14. Lecture 14 : Online Convex Learning: Online Gradient Descent [lec14]

  15. Lecture 15 : Online Mirror Descent [lec15]

  16. Lecture 16 : Online Mirror Descent [lec16]

  17. Lecture 17 : Online Mirror Descent, continued. [lec17]

  18. Lecture 18 : Minimax Rates for Online Learning [lec18]

  19. Lecture 19 : Minimax Rates for Online Learning, Sequential Rademacher Complexity [lec19]

  20. Lecture 20 : Sequential Rademacher Complexity [lec20]

  21. Lecture 21 : Sequential Complexity Measures [lec21]

  22. Lecture 22 : Relaxations for Online Learning [lec22]

  23. Lecture 23 : Relaxations for Online Learning [lec23]

  24. Lecture 24 : Relaxations for Randomized Algorithms [lec24]
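
As a companion to Lectures 13 and 18-19, here is a minimal sketch of the exponential weights (Hedge) algorithm for learning with expert advice, with the regret against the best expert in hindsight reported at the end. It is not taken from the course notes; the function name and the learning-rate tuning are illustrative assumptions.

```python
import numpy as np

def exponential_weights(expert_losses, eta=None):
    """Run exponential weights (Hedge) on a T x N matrix of expert losses
    in [0, 1]; return the learner's expected cumulative loss and its regret
    against the best single expert in hindsight."""
    T, N = expert_losses.shape
    # Standard tuning; gives regret at most sqrt(T * log(N) / 2) for [0,1] losses.
    eta = np.sqrt(8.0 * np.log(N) / T) if eta is None else eta
    log_w = np.zeros(N)                         # log-weights, start uniform
    learner_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                            # current distribution over experts
        learner_loss += p @ expert_losses[t]    # expected loss this round
        log_w -= eta * expert_losses[t]         # multiplicative weight update
    best_expert_loss = expert_losses.sum(axis=0).min()
    return learner_loss, learner_loss - best_expert_loss

# Illustrative example: 5 experts with random losses over 1000 rounds.
rng = np.random.default_rng(0)
losses = rng.random((1000, 5))
print(exponential_weights(losses))
```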


Reference Material : (more references and specific links will be added as we go)

  1. Statistical Learning and Sequential Prediction, A. Rakhlin and K. Sridharan [pdf]

  2. Introduction to Statistical Learning Theory, O. Bousquet, S. Boucheron, and G. Lugosi [pdf]

  3. Prediction, Learning, and Games, N. Cesa-Bianchi and G. Lugosi [link]

  4. Understanding Machine Learning: From Theory to Algorithms, S. Shalev-Shwartz and S. Ben-David [link]

  5. Introduction to Online Convex Optimization, Elad Hazan [link]

  6. Concentration inequalities, S. Boucheron, O. Bousquet, and G. Lugosi [pdf]

  7. A Gentle Introduction to Concentration Inequalities, K. Sridharan [pdf]

  8. On the Vapnik-Chervonenkis-Sauer Lemma by Leon Bottou [link]