Machine Learning Theory (CS 6783)
Announcements :
- Lecture 25 pdf posted.
- Lecture 24 pdf posted.
- Lecture 23 pdf posted.
- Lecture 22 pdf posted.
- Lecture 21 pdf posted.
- Lecture 20 pdf posted.
- Lecture 19 pdf posted.
- Lecture 18 pdf posted.
- Lecture 17 pdf posted.
- Lecture 16 pdf posted.
- Lecture 15 pdf posted.
- Lecture 14 pdf posted.
- Lecture 13 pdf posted.
- Lecture 12 pdf posted.
- Lecture 11 pdf posted.
- HW 2 is out, due on Oct 7th.
- Lecture 10 pdf posted.
- Lecture 9 pdf posted.
- Lecture 8 pdf posted.
- Lecture 7 pdf posted.
- Lecture 6 pdf posted.
- Lecture 5 pdf posted.
- Office Hours Fridays, 2-3pm
- Lecture 4 pdf posted.
- Homework 1 is out, due Sep 11th. Available through CMS; if you are having trouble finding it, contact me. Submit assignments in LaTeX format via CMS or via email.
- Course added on Piazza; please join.
- Lecture 3 pdf posted.
- Homework 0 : correction on problem 1 posted.
- Homework 0 posted : not for grade, no need to submit.
- Lecture 1 slides available
Location : Upson, 5130
Time : Tue-Thu 1:25 PM to 2:40 PM
Office Hours : Fridays, 3-4pm
We will discuss both classical results and recent advances in both statistical (iid batch) and online learning theory. We will also touch upon results in computational learning theory. The course aims at providing students with tools and techniques to understand the inherent mathematical complexities of learning problems, to analyze and prove performance guarantees for machine learning methods, and to develop theoretically sound learning algorithms.
Students require a basic level of mathematical maturity and ease/familiarity with theorem-proof style material. Familiarity with probability theory, basics of algorithms, and an introductory course on Machine Learning (CS 4780 or equivalent) are required. M.Eng. and undergraduate students require permission of the instructor.
Tentative topics :
1. Overview of the learning problem : statistical and online learning frameworks
. PAC and agnostic PAC models, online learning protocol (adaptive/oblivious adversaries, various notions of regret)
. connections to optimization and statistical estimation
2. Minimax formulation for learning, distribution free and adversarial learning settings, uniform guarantees and no free lunch
3. Statistical Learning Framework
. Empirical risk minimization and Regularized empirical risk minimization
. Uniform convergence (iid data)
. finite classes, PAC Bayes theorem, compression bounds
. VC dimension and growth function
. Rademacher complexity, covering numbers, Dudley integral bounds, fat-shattering dimension
. Supervised learnability : necessary and sufficient conditions via uniform convergence (iid)
. Local Rademacher analysis and fast rates
4. Online Learning Framework (sequential prediction/decision making)
. Learning with expert advice, perceptron, winnow
. Sequential minimax analysis for online learning
. Uniform convergence over martingales
. Sequential Rademacher complexity, sequential covering numbers, Sequential fat-shattering dimension
. Supervised online learnability : necessary and sufficient conditions via martingale uniform convergence
5. Deriving algorithms through relaxation and minimax analysis for online learning
6. Computational Learning Theory (Lower bounds)
. Computational hardness for proper learning
. Cryptographic hardness for learning and average case hardness
. Lower bounds through oracle models of optimization
7. Additional topics
. Algorithmic stability tools
. Statistical estimation vs. statistical learning vs. stochastic optimization
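As a small taste of the online learning material above, learning with expert advice is commonly handled by the exponential (multiplicative) weights algorithm. Below is a minimal sketch; the function name and the loss-matrix interface are illustrative choices, not taken from the course notes:

```python
import math

def exponential_weights(expert_losses, eta):
    """Exponential weights for prediction with expert advice.

    expert_losses: list of rounds; each round is a list of losses
                   in [0, 1], one entry per expert.
    eta: learning-rate parameter.
    Returns the learner's total expected loss over all rounds.
    """
    n = len(expert_losses[0])          # number of experts
    weights = [1.0] * n                # start with uniform weights
    total_loss = 0.0
    for losses in expert_losses:
        z = sum(weights)
        probs = [w / z for w in weights]
        # expected loss of the randomized learner this round
        total_loss += sum(p * l for p, l in zip(probs, losses))
        # multiplicative update: downweight experts that did poorly
        weights = [w * math.exp(-eta * l)
                   for w, l in zip(weights, losses)]
    return total_loss
```

The classical analysis gives regret against the best expert of order sqrt(T log n) for a suitably tuned eta, which is the kind of guarantee the online learning part of the course derives and generalizes.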
There will be a total of 6 assignments, 5 of which count towards your grade (Homework 0 doesn't count towards your grade and need not be submitted).
- Homework 0 : [pdf] (don't submit, not for grade)
- Homework 1 : [via cms] (submit solutions via cms or email)
Term project :
There will be a small term project due by the end of the course. The project could be one of :
1. Your choice of research problem approved by me for which you will submit a report by end of term or,
2. Reading two papers related to the course and writing detailed reports about them or,
3. Completing problems worth up to 10 stars from a list of problems which shall be provided soon. (number of stars for a given problem will depend on its difficulty)
- Lecture 1 : Introduction, course details, what is learning theory, learning frameworks [slides]
Reference :  (ch 1 and 3)
- Lecture 2 : Recap, no free lunch theorem, minimax formulations for PAC (realizable) setting, non-parametric regression (well specified case), statistical learning and online learning. Relationship between the corresponding minimax rates.
Reference :  (ch 5, only Section 5.2 (universal data compression and online convex optimization were not covered) and Section 5.3.1)
- Lecture 3 : Statistical learning, empirical risk minimization, minimax rates and uniform convergence, finite class case, MDL bound, uniform vs. universal rates
- Lecture 4 : Symmetrization, Rademacher complexity, examples, Binary classification and growth function.
- Lecture 5 : Binary classification in the statistical learning setting, growth function, VC dimension, VC/Sauer/Shelah Lemma, characterization of learnability for binary classification problems in the statistical learning framework.
- Lecture 6 : VC dimension review + continued.
- Lecture 7 : Properties of Rademacher complexity.
- Lecture 8 : Covering numbers and Pollard bound
- Lecture 9 : Dudley chaining
- Lecture 10 : Fat shattering dimension
- Lecture 11 : Supervised learnability, Intro to Online Learning
- Lecture 12 : Online Learning, experts algorithm, bit prediction
- Lecture 13 : Online Learning, minimax rate, sequential Rademacher complexity
- Lecture 14 : Sequential Rademacher complexity
- Lecture 15 : Sequential Rademacher complexity, Properties
- Lecture 16 : Sequential Covers and Fat-shattering Dimension
- Lecture 17 : Online Convex Optimization
- Lecture 18 : Online Mirror Descent
- Lecture 19 : Online Mirror Descent Contd.
- Lecture 20 : Wrapping up MD, General Learning Algorithms
- Lecture 21 : Relaxations
- Lecture 22 : Relaxations, deriving algorithms
- Lecture 23 : Relaxations, deriving algorithms
- Lecture 24 : Relaxations, deriving randomized algorithms
- Lecture 25 : Relaxations, deriving randomized algorithms
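Lectures 17-20 develop online convex optimization and online mirror descent; with the Euclidean regularizer, mirror descent specializes to projected online gradient descent. Below is a minimal sketch for linear losses l_t(x) = <g_t, x> over the unit L2 ball (for linear losses the gradient does not depend on the point played, so the loss vectors can be passed in directly); the function name and interface are illustrative, not from the lecture notes:

```python
import math

def ogd_unit_ball(gradients, eta):
    """Projected online gradient descent on the unit L2 ball.

    gradients: list of rounds; each round is the gradient (list of
               floats) of the linear loss l_t(x) = <g_t, x>.
    eta: step size.
    Returns the list of points the learner played, one per round.
    """
    dim = len(gradients[0])
    x = [0.0] * dim                    # start at the center of the ball
    played = []
    for g in gradients:
        played.append(list(x))
        # gradient step on the current loss
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        # project back onto the unit L2 ball
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm > 1.0:
            x = [xi / norm for xi in x]
    return played
```

With step size eta of order 1/sqrt(T), this achieves regret O(sqrt(T)) against the best fixed point in the ball, the benchmark rate the mirror descent lectures establish (and improve on for other geometries via non-Euclidean regularizers).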
Reference Material : (more references and specific links will be added as we go)
- Statistical Learning and Sequential Prediction, A. Rakhlin and K. Sridharan [pdf]
- Introduction to Statistical Learning Theory, O. Bousquet, S. Boucheron, and G. Lugosi [pdf]
- Prediction, Learning, and Games, N. Cesa-Bianchi and G. Lugosi [link]
- Understanding Machine Learning : From Theory to Algorithms, S. Shalev-Shwartz and S. Ben-David [link]
- Introduction to Online Convex Optimization, Elad Hazan [link]
- Concentration inequalities, S. Boucheron, O. Bousquet, and G. Lugosi [pdf]
- A Gentle Introduction to Concentration Inequalities, K. Sridharan [pdf]
- On the Vapnik-Chervonenkis-Sauer Lemma by Leon Bottou [link]