Logistical information

CS 6241, Fall 2023: Numerical Methods for Data Science
Lecture time: TR 2:55-4:10
Lecture location: Hollister 320

Prof: David Bindel
Email: bindel@cornell.edu
OH: Tue 10-12, or by appointment.

Course description

This is a graduate-level course on numerical methods prominent in modern data analysis and machine learning. Students must have a strong grounding in linear algebra and probability as well as sufficient mathematical maturity. Prior experience with numerical methods at the level of CS 4210/4220 or CS 6210/6220 will be highly useful, though not strictly required. The course will consist of six units of about two weeks each:

  • Least squares and regression
  • Low-rank factorizations for matrix and tensor data
  • Low-dimensional structure in function approximation
  • Kernel interpolation and Gaussian processes
  • Numerical methods for graph data analysis
  • Methods for learning models of dynamics

We will pay particular attention throughout to sparsity, rank structure, and spectral behavior of underlying linear algebra problems; convergence behavior and “regularization via iteration” effects for standard solvers; and comparisons between numerical methods for data analysis and large-scale numerical methods used in other areas of science and engineering.

Course work

Notes and readings will be posted on the course web page. We recommend reading the notes prior to the meeting.

Class meetings will typically consist of an ice-breaker or open question, followed by lecture and full-class discussion. Other than activities to help students keep on track with the readings, the main course activity will be a course research project.

Course technology

The public course web page will be used for all activities that can readily be shared with the world. Otherwise, we will use Canvas for assignments, together with integrations for discussion (Ed Discussion). We will use the Julia programming language for our code examples, but you are welcome to use other languages as you prefer.

Course policies

Grading

Graded work will consist of a term research project (50%), participation (45%), and course feedback and evaluations (5%). On a 200-point scale, this will consist of:

Research project

After an initial reaction paper (done individually), students will work in groups of 1-3 on a term-length research project. Part of the credit will involve participating in peer review of contributions from two other groups. The parts of this project will include:

  • Reaction paper involving reading and critique of at least two papers - 10 points
  • Project proposal for 1-3 people, including pointers to related work and a plan for how team members will work together - 10 points
  • Short progress report - 5 points
  • Draft report - 10 points
  • Peer review of two other draft reports - 10 points each
  • Final report - 40 points

Participation and feedback

After the first lecture, where we lay out this syllabus, there will be 25 additional lectures. For each lecture, there will be a class notebook that includes lecture materials and (at least) three related short problems. We ask for a total of 90 points of “participation work,” which may include:

  • Problems from the class notebook (2 points each)
  • Providing a (solved) question suitable for homework (2 points each)
  • Providing materials equivalent to one lecture on a topic of interest (20 points, must be discussed with the professor in advance)

We will provide detailed feedback on some class notebook problems, but for many we will grade based on participation and provide detailed feedback only on request.

Collaboration

Collaboration in this class is explicitly encouraged, as is reference to the research literature. You can and should work together, with co-authorship for group assignments.

An assignment is an academic document, like a journal article. When you turn it in, you are claiming everything in it is your original work, unless you cite a source for it.

If you get an idea from a classmate, the professor, a book or other published source, or elsewhere, please provide an appropriate citation. This is not only critical to maintaining academic integrity, but it is also an important way for you to give credit to those who have helped you out. When in doubt, cite! Code or writeups with appropriate citations will never be considered a violation of academic integrity in this class (though you will not receive credit for code or writeups that were shared when you should have done them yourself).

For more information, see Cornell’s Code of Academic Integrity.

Emergency procedures

In the event of a major campus emergency, course requirements, deadlines, and grading percentages are subject to changes that may be necessitated by a revised semester calendar or other circumstances. Any such announcements will be posted to Canvas and the course home page.