Course Description
An introductory course in machine learning, with a focus on data modeling and related methods and learning algorithms for data sciences. Tentative topic list:
- Dimensionality reduction, such as principal component analysis (PCA) and the singular value decomposition (SVD), canonical correlation analysis (CCA), independent component analysis (ICA), compressed sensing, random projection, the information bottleneck. (We expect to cover some, but probably not all, of these topics.)
- Clustering, such as k-means, Gaussian mixture models, the expectation-maximization (EM) algorithm, link-based clustering. (We do not expect to cover hierarchical or spectral clustering.)
- Probabilistic-modeling topics such as graphical models, latent-variable models, inference (e.g., belief propagation), parameter learning.
- Regression will be covered if time permits.
This course can be taken independently of CS4780/5780 (Machine Learning for Intelligent Systems), or before or after it in either order.
Prerequisites: probability theory (BTRY 3080, ECON 3130, MATH 4710, or strong performance in ENGRD 2700 or equivalent); linear algebra (MATH 2940 or equivalent); CS2110 or equivalent programming proficiency.
News (see also announcements on lecture handouts)
- Tuesday March 31st: Competition I: the clustering challenge data set is out! Instructions on what to do with the data will be posted only after spring break.
- Tuesday March 21st: Assignment A2, out on March 19th, is due Friday, March 27th at 5:00 pm.
- Tuesday March 17th: the final project (competition) due date is, as set by the registrar, Monday May 11th at 4:30 pm.
- Wednesday February 18th: Watch the assignments page for minor updates of A1. Crucial updates will be announced via email using CMS.
- Friday February 13th: Assignment A1 is out, due on March 3rd at 11:59 pm via CMS. The assignment covers PCA, CCA, and random projection, with some fun examples and data available for download!
- Friday February 13th: All A0s we have received are now in the Gates 216 homework handback room. We will check which A0s have been picked up sometime during the week of February 23rd and assign credit for A0 based on that information.
- Wednesday February 11: There will be no office hours during the February break except those held by Mevlana Gemici (if students let him know by email beforehand that they expect to come) and Jack Hessel; see the "Info" page for exact times and locations.
- Tuesday February 10: No lecture on Feb 12 due to A1 preparation by profs; the lecture of Feb 19 is an optional but recommended review session, planned with an eye toward A1.
- Wednesday, February 4: A0s may be collected from the Gates Hall homework handback room, Gates 216, hours MTWF 12:00 pm - 4:00 pm and R 12:30 pm - 4:00 pm. Bring your Cornell ID.
- Monday, February (already!) 2: students who turned in an A0 with their netid written on it have been added to the course CMS, and are assumed to be taking the course. To receive full credit on the assignment, you will need to pick up your commented-upon A0 at the end of class on Tuesday, February 3 (or via other arrangements, to be announced soon).
 If you did not turn in an A0, we presume you are not taking the course for credit. If this is not the case, please let us know what your situation is.
- Friday, January 23: clarification regarding CMS: we won't be adding any students to it until after HW0 is due, and then we'll do a big semi-auto-enroll from the registrar's records and what gets handed in. So, no need to ask us to add you to CMS. Sorry for not letting you know this earlier!
- Thursday, January 22: The first question has been asked and answered on the course Piazza page. We encourage you to use this resource to ask and answer questions.
- Thursday, January 22: The first assignment is out!
We apologize in advance for the inconvenience
This is a new course, covering topics for which there is no pre-existing canonical material (and hence no required textbook). We must therefore concentrate our efforts this semester on developing the course content. We are really excited to do this work! But it does mean that some of the administrative details will have rough edges. We can only ask for your patience, and promise that we will do the best we can. We are looking forward to having you with us on this adventure!